PREFACE


This multi-authored AGARDograph represents the preliminary survey of the Working Group (AMP-WG-08).
“Evaluation of Methods to Assess Workload” was initiated by the AGARD Aerospace Medical Panel in January 1977,
following approval by the National Delegates Board (NDB) in the fall of 1976. Working Group meetings were held at
Cologne (April 1977), London (October 1977), Fort Rucker, Alabama (May 1978), and Paris (November 1978),
concurrent with symposia conducted by the Aerospace Medical Panel. Early meetings focused, as would be expected,
on the scope of the task. While it was evident that the broad outline could be described with a high degree of agreement,
it was also apparent that tasking individual members with sub-areas would require that they prepare manuscripts de novo,
a level of effort clearly not desired, given the substantial burden each of them already had in his own laboratory. It was
therefore decided at Fort Rucker to seek contributed chapters from Working Group members and others in the NATO
scientific community who had on hand materials which could be readily adapted to the objectives of the Working Group.
As the reader will see, numerous contributions were received. The editors feel that the result is a wide-ranging
compendium of workload measurement methodology, though most certainly some methods have been either missed or
are underrepresented.

The  objectives  and  scope  of  the  effort,  as  approved  by  the  NDB,  were  as  follows: 

OBJECTIVES: Military aircraft are becoming increasingly complex, the associated avionics systems
more sophisticated, and the mission profiles more demanding. The objective of the
Working Group is to study if such an increase in aircrew workload has become a
limiting factor in the operational employment of some aircraft and to determine
valuable methods to evaluate this workload.

SCOPE OF WORK: The measurement domain will be broken down into sensory threshold function
tests, motor function, and responses to psycho, physio, and chemical excitation.
The methodology will include a wide range of instrumentation, laboratories,
inflight measurement and modelling methods, with the goal of compiling
systematically and evaluating the multiplicity of approaches and techniques
implied.


LIST OF CONTRIBUTORS


Richard  A.  Albanese 

USAF  School  of  Aerospace  Medicine 

Brooks  AFB,  Texas,  78235,  USA 

Jackson Beatty
Department  of  Psychology 
University  of  California  at  Los  Angeles 
Los  Angeles,  California,  90024 

Clyde  A.  Brictson 
Anthony  P.  Ciavarelli 
Dunlap and Associates, Inc.

Western  Division 

7765  Girard  Ave.,  Suite  04 

La  Jolla,  California,  92037 

Edward  P.  Buckley 
William  F.  O’Connor 
Tom  Beebe 

Federal  Aviation  Agency 

National  Aviation  Facilities  Experimental  Center 
Atlantic  City,  New  Jersey,  08405 

R.  Cannings 

Royal  Air  Force  Institute  of  Aviation  Medicine 
Farnborough GU14 6SZ
Hants, UK

Billy  M.  Crawford 
Systems  Research  Branch 
Human  Engineering  Division 
Wright-Patterson  AFB,  Ohio,  45433 

Walter  B.  Gartner 

Miles  R.  Murphy 

333  Ravens  Wood 

Menlo  Park,  California,  94025 

G.H.  Lawrence 

Office  of  Naval  Research 

Physiology  Program  (Code  441) 

800  North  Quincy  Street  -  Room  433 
Arlington,  Virginia,  22217 

Richard  E.  McKenzie 
Edward  P.  Buckley 
Kiriako  Sarlanis 
Federal Aviation Agency

National  Aviation  Facilities  Experimental  Center 
Atlantic  City,  New  Jersey,  08405 


Richard  E.  McKenzie 

Bryce  O.  Hartman 

Crew  Technology  Division 

USAF  School  of  Aerospace  Medicine  (AFSC) 

Brooks  AFB,  Texas,  78235,  USA 

Carl  E.  Melton 

Aviation  Physiology  Laboratory 
Civil Aeromedical Institute
FAA  Aeronautical  Center 
Oklahoma  City,  Oklahoma,  73125 

Layne P. Perelli

Crew  Technology  Division 

USAF  School  of  Aerospace  Medicine  (AFSC) 

Brooks  AFB,  Texas,  78235,  USA 

Alan  H.  Roscoe 

Royal  Aircraft  Establishment 

Bedford,  UK 

Prof.  Gaetano  Rotondo 

Italian  Air  Force  Medical  Service  HQ 

Via  P.  Gobetti  2  -  00185  Rome,  Italy 

R.  Simmons 
M.  Sanders 
K.  Kimball 

US Army Aeromedical Research Laboratory
Fort Rucker, Alabama, 36362, USA

Walter  W.  Wierwille 
Robert  C.  Williges 

Virginia  Polytechnic  Institute  and  State  University 
Blacksburg,  Virginia,  24060,  USA 

Samuel  G.  Schiflett 

Naval  Air  Test  Center 

Patuxent River, Maryland, 20670, USA


CONTENTS


PREFACE

LIST OF CONTRIBUTORS

INTRODUCTION

Chapter 1  CONCEPTS OF WORKLOAD
by W.B.Gartner and M.R.Murphy

Chapter 2  CONCEPTS OF FATIGUE
by W.B.Gartner and M.R.Murphy

Chapter 3  CONCEPTS OF STRESS
by R.E.McKenzie

Chapter 4  SOME CONSIDERATIONS CONCERNING METHODS TO EVALUATE
AND ASSESS WORKLOAD IN AIRCRAFT PILOTS
by G.Rotondo

Chapter 5  PHYSIOLOGIC ASPECTS OF WORKLOAD/FATIGUE/STRESS
by L.P.Perelli

Chapter 6  SOME INSIGHTS RELATIVE TO THE MAN-MACHINE SYSTEM:
AN OVERVIEW OF TEN YEARS OF RESEARCH
by R.E.McKenzie and B.O.Hartman

Chapter 7  AIRCREW WORKLOAD ASSESSMENT TECHNIQUES
by W.W.Wierwille, R.C.Williges and S.G.Schiflett

Chapter 8  WORKLOAD ASSESSMENT METHODOLOGY DEVELOPMENT
by B.M.Crawford

Chapter 9  QUANTITATIVE MILITARY WORKLOAD ANALYSIS
by R.A.Albanese

Chapter 10  VISUAL PERFORMANCE: A METHOD TO ASSESS WORKLOAD
IN THE FLIGHT ENVIRONMENT
by R.Simmons, M.Sanders and K.Kimball

Chapter 11  HANDLING QUALITIES, WORKLOAD, AND HEART RATE
by A.H.Roscoe

Chapter 12  BRAIN WAVES AND THE ENHANCEMENT OF PILOT PERFORMANCE
by G.H.Lawrence

Chapter 13  PUPILLOMETRIC METHODS OF WORKLOAD EVALUATION:
PRESENT STATUS AND FUTURE POSSIBILITIES
by J.Beatty

Chapter 14  AIRCREW PERFORMANCE RESEARCH OPPORTUNITIES USING
THE AIR COMBAT MANEUVERING RANGE (ACMR)
by C.A.Brictson and A.P.Ciavarelli

Chapter 15  SPEECH PATTERNS AND AIRCREW WORKLOAD
by R.Cannings

Chapter 16  AN EXPLORATORY STUDY OF PSYCHOPHYSIOLOGICAL
MEASUREMENT AS INDICATORS OF AIR TRAFFIC CONTROL
SECTOR WORKLOAD
by R.E.McKenzie, E.P.Buckley and K.Sarlanis



INTRODUCTION 


Task complexity is everywhere in the environment within which the operational pilot functions: avionics systems,
commonly with a digital computer core and a wide range of sensors and information displays; a cockpit packed
with flight displays and controls; capabilities and, at times, requirements for multiple missions; confrontation with a
variety of threat systems; crowded airspace; multiple command/control/target designation systems and techniques;
a host of environmental burdens, inside and outside the cockpit. In addition, the NATO nations have seen the
emergence of multi-role aircraft and an expansion in the tactical employment of the helicopter. One result of these
technological and operational advances has been a marked increase in aircrew workload. This increase in workload
has become a problem of operational significance, to the point where, in some cases, aircrew capability has become a
limiting factor in the operational employment of some aircraft in the more demanding missions. As a consequence,
problems of aircrew workload have assumed increasing importance in the NATO research community.

Methods of measuring workload have a substantial history in the NATO research community. Disciplines
represented include systems design engineering, operations research, the behavioral sciences, aerospace medicine,
physiology, biochemistry, and biotechnology in general. There has been considerable variation in the kinds of
experimental tasks employed, the measures obtained, the instrumentation used, the analytic models and methods
employed, the ratio of synthetic modelling versus empirical data used, and the kinds of laboratory facilities required.
The measurement domains include measures of sensory threshold, measures of sensory integration, cognitive function
tests, measures of motor function, vigilance, reaction time, psychophysiologic responses, physiologic and biochemical
changes. Methodology includes a wide range of instrumentation, laboratory facilities and environments, inflight
measurement methods, and modelling methods. Analysis models and experimental design requirements also vary
considerably. Computer utilization in the areas of experimental programming and data processing has become
commonplace. Periodic overviews of current findings are necessary. There is a need for summary matrices, as well as a widely
endorsed taxonomy of human performance.

This  AGARDograph  is  one  such  periodic  overview.  It  is  current  in  the  sense  that  each  chapter  is  a  condensation  or 
modification of recent papers, prepared specifically by each author to fit the objectives of this Working Group. Ongoing
research  involving  advances  in  workload  measurement  technology  obviously  cannot  be  represented  in  this  report,  since 
the  editors  avoided  tasking  contributors  with  the  preparation  of  chapters  “de  novo.”  Such  is  the  nature  of  “periodic 
overviews.” 

It will be helpful to the reader to have a “road-map” of this report. Diagrammatically, it looks like this:


CONCEPTS OF WORKLOAD*

by 

Walter  B.  Gartner,  Ph.D. 

Miles R. Murphy, Ph.D.

333 Ravens Wood
Menlo Park, California 94025

In ordinary uncritical discourse, the phenomena referred to by the terms "pilot workload" and "fatigue"
are easily distinguished. In its broadest and simplest aspect, pilot workload refers to how much a pilot
must do to perform a specified flight operation. Fatigue is widely understood as a feeling of tension or
weariness, often accompanied by an obvious unwillingness or inability to continue to work or perform.
However, when attempts are made to quantify the workload imposed on a pilot by a particular aircraft
design or operational procedure, or to assess the effects of fatigue upon system performance, important
unresolved issues arise in regard to the more precise specification of workload and fatigue concepts and
to the adequacy of assessment criteria and techniques. This chapter and the next address the principal
unresolved issues in conceptualizing and measuring pilot workload and fatigue. In a survey of the origins
of operator workload concepts, Jahns (1) has found it useful to characterize workload as "an integrative
concept for evaluating the effects on the human operator associated with multiple stresses occurring within
man-machine environments." Further, he proposed to partition this broad conception of workload under
three functionally related components: (1) input load, (2) operator effort, and (3) work result.

While broader conceptions may be considered useful for indicating the range and diversity of workload
reference, the purpose here is to outline the principal ways in which investigators have elected to
restrict the use of the term. Therefore, we will discuss Jahns' basic classification scheme with only
some minor changes in terminology.

Workload as a Set of Task Demands: The common attribute of task-demand concepts of workload is the use
of the term to refer to requirements for task performance which can be specified without reference to any
operator response or activity actually applied to satisfy these requirements. The distinction between
demands, as such, and any actual operator response including capabilities, readiness to respond, etc., is
a very important one. One approach to the treatment of workload as demand is exemplified by Klein's (2)
attempt to quantify and predict "design-specific instantaneous workload levels imposed upon the pilot
while in flight." In distinguishing this approach from traditional workload quantification methods, Klein
emphasized that "workload is addressed from the standpoint of predicting human performance requirements
as demanded by the system and its operational environment rather than from the standpoint of measurement
of human responses to those demands."

The application of task analysis techniques within a designated system-mission-environment context to
determine task performance requirements is a familiar and widely used human factors practice. Gartner
et al. (3) proposed that demands be more strictly defined as inputs to the crew which actually serve
(directly or indirectly) to establish crew performance objectives or to represent operational conditions
and events which in an actual flight situation would be expected to initiate crew activity or modify
ongoing crew responses. Task demands are identified using functional criteria, that is, they are inputs
that operate as response programs or as action requirements for the crew. This distinction between response-
and stimulus-oriented expressions of task demand is considered to underlie some of the problems in
workload assessment and practical application, because different kinds of demands are often confounded. In
other words, system-oriented and situation-specific demands are often confused with perceived demands by
the operator or with the behavioral or psychophysiological demands imposed on an operator by an assigned
task. This task-demand concept is closely related to Jahns' (1) input load component, which he defines
operationally as "a vector (L) of input data which must be transformed by the operator into a vector (P)
of output data to satisfy a specified performance criterion function and/or maintain a homeostatic operator
state." This input load characterization of task demand fits a variety of operator-loading concepts that
distinguish one or more sensory channels or modalities as important to task performance, and addresses
such concerns as channel capacity, perceptual overload, and so forth. For example, in his review of task
load factors, Hartman (4) defined load as "the sum of all requirements imposed on the operator at any
instant by the system," and later distinguished load as the number of information channels affecting
operator performance.

The defining feature of demand-oriented expressions of workload is simply that they be free of any
dependence upon considerations of operator response or response capabilities. In view of the apparent
difficulty in sustaining this distinction in practice, it is probably advisable to associate task demand
only with input or stimulus-oriented variables and to reserve workload for the response-oriented variables.

Workload as Effort: The focus of the conceptualization of workload as effort relates to how much an
operator has to do, and/or how he must work to satisfy a specified set of demands. A general
characterization of this concept of workload has been given by Cooper and Harper (5): "The term workload is
intended to convey the amount of effort and attention, both physical and mental, that the pilot must
provide to attain a given level of performance."

A somewhat different emphasis is provided by Welford (6) in characterizing effort as "the intensity
with which action is carried out. A man may work either more or less hard at a job." Here, the emphasis
shifts from effort required to the consideration of the effort a human operator actually does exert in
the performance of a task.


*  This  chapter  was  abstracted  by  the  editor  from  NASA  TN  D-8365,  Pilot  workload  and  fatigue:  a  critical 
survey  of  concepts  and  assessment  techniques  with  permission  of  the  authors. 
In his elaboration of the operator-effort component of workload, Jahns emphasizes the operator's
readiness to respond and he identifies such factors as experience, motivation, set, physiological
readiness and physical factors, as well as the general background and personality of the operator as
determinants of this operator's state.

The concept of effort is most often used simply to refer to how hard a man is working and not to
the actual task performance or to the difficulty or demands of the task. Singleton (7) has argued for
the separation of performance and effort by invoking the familiar observation that "an operator may be
performing better in one of two tasks as compared in an experiment because he is trying harder rather
than because one task is easier than the other." Whatever it is that occurs when a man is working
harder is referred to as effort.

Workload as Activity or Accomplishment: The conceptualization of workload as activity refers to actual task
performance or the products of this activity. It is often used in operational studies of the effects of
operating procedures or system design on aircrew performance. In a summary report (8) on a UAL-ALPA
joint project to evaluate pilot workload, the authors (who are not named) defined workload in terms of
the total activity of the captain and co-pilot in performing such tasks as flight-path control, vigilance,
communications, navigation, and system operation during each phase of an actual flight. The actual
activities engaged in by crew members have also been used as workload referents in long-term studies of
crew performance factors. For example, Cantrell and Hartman (9) recorded typical flight-crew activities
over 20 consecutive days, including off-duty and administrative activities as well as those carried out
in flight, to be used as an index of workload.

Workload Assessment Techniques: A critical review of workload assessment techniques by Gartner and Murphy
(10) indicates that, despite conceptual and practical difficulties, the attempt to develop and apply useful
measures of pilot workload is being vigorously pursued. The workload techniques which they examined
included task-demand analysis, measures of task performance, psychophysiological measures and subjective
reports. None of these assessment techniques were found to be free of significant limitations in their
sensitivity to differences in task difficulty, in distinguishing between physical and mental effort, or
in the reliability of data acquisition and interpretation procedures.

With respect to workload, Gartner and Murphy recommend that significant improvements in both
measurement and management can best be accomplished by refinements and innovations in the analysis and
measurement of pilot effort. They state, "human-factors engineering activities are already being applied
to task-demand analysis, and effective techniques are available for this application." However, systematic
attempts to assess effort per se are considerably less in evidence, despite the fact that such
assessments are needed for the empirical evaluation and adjustment of task demands. Innovations in the direct
assessment of effort would also provide a basis for developing more effective "effort control" techniques.
They also point out that there are directly assessable neuromuscular tension patterns which can be
reliably related to both central neurophysiological states and the task-relevant phenomena of attention
and perception.

In summary then, it can be seen that there are several ways of conceptualizing workload, though in
general they might be divided into an emphasis on the input side (task demands) or the output side (the
work output). Similarly, there are variations in the appropriate measurement techniques, though here we
see no obvious simplification. The diversity of definitions and approaches accounts for this working
group (AMP WG-08) report, and is a condition which should be kept in mind as the reader proceeds through
this document.
CONCEPTS OF FATIGUE*

A good summary statement of the recurring theme that the central difficulty in dealing effectively
with the problem of fatigue is one of definition was presented by Welford (1). According to him, fatigue
means a subjective state following some kind of physical or mental strain in ordinary "man-in-the-street"
constructs. To the physiologist, however, fatigue means some kind of reduction of response following
more or less prolonged activity. The psychologist is placed in the middle and charged with
the responsibility of tackling the problem of fatigue relative to practical human affairs. Unfortunately,
according to Welford, he often evades this responsibility by dismissing fatigue as unscientific or by
redefining the phenomena.

In another reference, Welford (2) notes that "difficulties have led some to wish to abandon the term
fatigue, yet there is a need for a term to cover those changes in performance which take place over a
period of time during which some part of the mechanism, whether sensory, central, or muscular, becomes
chronically overloaded." Bartley (3) develops the position that the inherent utility of the concept
will be realized only when it is clearly distinguished from such considerations as: (1) the situation in
which it occurs, (2) the bodily expression of fatigue, and (3) the effects of fatigue on performance,
work output, and so forth. However, it will be apparent in the following overview of fatigue concepts
that such phenomena have not been excluded from more restrictive definitions of fatigue, and that
considerable diversity in the contemporary use of the term remains. Part of the problem appears to be
related to the wide overlap between the concepts of workload and those of fatigue. In the detailed
overview of these concepts, Gartner and Murphy (4) demonstrate that within an average of 30.5 workload
and fatigue indicators, well over 50% of the indicators are directly or indirectly related or overlapped
to a significant if not indistinguishable degree.

Fatigue as a Feeling of Weariness or Tiredness: This conceptualization of fatigue has been characterized
by Bartley (3) as experimental or sensory-cogitative. Experimental concepts seem to be favored in
operational studies of fatigue wherein extensive use is made of subjective assessments. In his review
of operational studies, Schreuder (5) elaborates on the subjective aspects of fatigue to suggest that
"The ordinary sense of weariness which the pilot subjectively feels after a hard day's work should not
be labeled as fatigue." Schreuder would insist on a level of intensity of this feeling of weariness
"which is an excess of the expected normal fatigue and which is cumulative and of such amount as to
alter the pilot's judgement and ability." Factor analytic studies of fatigue indicate that the
sensation of fatigue has three major components: (1) bodily tiredness and drowsiness, (2) weakened
motivation or concentration, and (3) a group of physical complaints, not unlike those of psychosomatic
disorders. Other investigators are satisfied with more global and unqualified definitions: Yoshitake
(6), "The feeling of fatigue signifies overall unpleasantness experienced by workers and is not quite
the same as complaints of symptoms of fatigue."

Fatigue as a Clinical Syndrome: In clinical practice, subjective complaints and/or specific sets of
signs and symptoms are regarded as useful working definitions for fatigue. Mohler (7) has outlined an
extensive list of signs and symptoms for both physical and mental fatigue, with the physical signs
expressed primarily in terms of physiological functions, i.e., increased blood glucose, increased lag
in pupillary response, instability of neuromuscular coordination, etc. Mohler's mental symptoms are
expressed in terms of psychogenic and emotional dysfunction and include increased irritability and
intolerance, tendency to depression and withdrawal, and increased sex drive, etc.

Hartman (8) suggests a three-category classification of fatigue (acute, cumulative and chronic),
characterizing acute fatigue as that normally occurring between a pair of sleep periods, and cumulative
fatigue as occurring over a period of days or weeks as a result of inadequate recovery from successive
periods of acute fatigue. Hartman urges a clinical definition of chronic fatigue as "a psychoneurotic
syndrome characterized by difficulty in committing oneself to an active or aggressive course of action,
and by a generalized withdrawal or retreat from conflict which is intolerable for situational or
personality reasons."

Fatigue as Performance Decrement or Skill Impairment: Fatigue concept referents in this category, like
the clinical signs and symptoms just cited, are often treated as indicators or effects of fatigue rather
than a distinguishable state. For example, Bartlett (9) states, "Fatigue is a term to cover all those
determinable changes in the expression of an activity which can be traced to the continuing exercise of
that activity under its normal operating conditions, and which can be shown to lead, either to
deterioration in the expression of that activity, or more simply, to results within the activity that are not
wanted."

A more formal expression of these changes in performance is provided by Hull's development of the
reactive-inhibition construct (10). Hull's behavioral restatement of Spearman's general law of fatigue
and Pavlov's concept of conditioned inhibition is: "Whenever any reaction is evoked in an organism
there is left a condition or state which acts as a primary negative motivation in that it has an innate
capacity to produce a cessation of the activity which produced the state. We shall call this state or
condition reactive inhibition."


* This chapter was abstracted by the editor from NASA TN D-8365, Pilot workload and fatigue: a critical
survey of concepts and assessment techniques with permission of the authors.
One of the more interesting variants is Bartlett's (11) concept of "skill fatigue." On the basis
of studies of pilot performance in the Cambridge psychological laboratory, he suggests that it is "necessary
to draw a broad distinction between fatigue produced by continued hard physical work and that produced
by work which calls for little continuous muscular effort, but demands persistent concentration and a
high degree of skill." Skill-fatigue, also distinguished from mental fatigue, is said to occur when a
task, such as piloting a plane, requires complex, coordinated, and accurately timed activities. In other
Cambridge studies, deterioration of skill performance was apparent after about 2½ to 3 hours of simulated
flying, manifesting primarily as a progressive lowering of standards of performance, the missing of important
information displays, and the gross mistiming of interrelated control actions.

Fatigue as a Neurophysiological Condition or State: In traditional or classical studies, fatigue was
referred to a particular neuromuscular site, that is, to specific motor units or muscle groups or organs
or tissue structures, and then defined in terms of specific biochemical and/or response capability changes.
This comparatively narrow focus is now generally recognized as only one aspect of fatigue. In a
discussion of neuromuscular fatigue as a special instance of a more general condition, Basmajian (12)
points out, "I shall observe at once the traditional and necessary warning that fatigue is a complex
phenomenon and perhaps a complex of numerous phenomena. The fatigue of strenuous effort is probably
quite different from the weariness felt after a long day's routine sedentary work. Undoubtedly, the
following types exist: emotional fatigue, central nervous system fatigue, general fatigue, and
peripheral neuromuscular fatigue of special kinds."

Welford (13) feels that fatigue is best conceptualized as a local neural impairment.

Grandjean (14), by contrast, shares the view of many investigators that fatigue is a central
neurophysiological condition and is located in the central nervous system, more specifically in the brain-stem
reticular activation system. His conceptualization of fatigue as a condition of the central nervous
system is based on early studies of the role of the brain-stem reticular formation in producing and
maintaining various levels of inactivity, arousal, and activation.

Welford has also suggested that considering fatigue as a central phenomenon is an attempt to integrate
the comparatively less accessible condition of mental fatigue with the more readily observed condition
of neuromuscular fatigue. He states, "It appears that in the intact organism changes in the muscles
brought about by prolonged or repeated contractions can, according to circumstances, have one of two
limiting effects. Either the muscles themselves become temporarily incapable of further contraction or
the condition of the muscles produces afferent stimuli and these in turn affect the central mechanisms
and lead to the cessation of efferent impulses." If the term mental fatigue is to have a meaning in
line with that of neuromuscular fatigue, it must denote the impairment of some brain mechanism as a
result of long continued use. The impairment must be reversible in the sense that it disappears with
rest, and may take the form of lowered sensitivity, or lowered responsiveness, or lowered capacity.

This definition by Welford permits a distinction to be made between mental fatigue and other central
conditions such as adaptation, habituation, and monotony or boredom which also lead to a decrement in
performance over time. However, others see no significant differences in operational definitions of
reactive inhibition, habituation, and central fatigue. Grandjean (14) expresses the popular view that
monotony and boredom are components of the fatigue condition and are related to the task situation: "If the
workload is too heavy, fatigue due to physical or mental effort is to be expected; if the worker is underloaded
or forced to conduct repetitive work, fatigue due to monotony will be produced."

Fatigue as a Level of Energy Expenditure: The energy expenditure approach to fatigue focuses on the cost
of protracted effort, whether mental or physical, in terms of the energy investments or transformations
required to sustain it. A formulation of the energistic approach by Dukes-Dobos (15) defines fatigue as
"a term to denote a normal psychophysiological process which starts immediately after the beginning of
any physical or mental activity and which consists of the utilization of the body's energy stores, the
accumulation of the breakdown products, and the activation of adaptive mechanisms which maintain the
homeostasis of the organism."

Cameron (16) considers the term fatigue to be no more than a useful descriptive term for a
generalized stress response over a period of time. "The human stress response is generalized in character,
involving the whole system of biological emergency mechanisms. Since it implies, by definition, an
abnormal demand on the energy resources of the system, it is fatiguing. The degree of fatigue experienced
may depend to some extent on the level of the stress response, but will depend primarily on its duration."
Here, he emphasizes the duration of the stress response of the organism, not necessarily the duration of
the stressful condition. This is a critical distinction, because he argues that the length of time
needed to return to a normal arousal level, that is, a normal level of biological emergency mechanism
activity, is a good index of the severity of fatigue.

McFarland (17) has criticized the focus on physiological factors in fatigue, citing the familiar
arguments that effects observed in the laboratory are not always found in actual work situations, that
other factors, mainly physical condition and motivation, often influence energy reserves and utilization
capacities, and that the metabolic costs of mental work are very slight. In this argument,
characterizations of the pilot's job as predominantly cognitive and not physical or muscular are frequently cited
to question the relevance of physiological factors, especially those derived from studies of heavy physical
workloads. It would seem that if this concept of fatigue as a level of energy expenditure is to bear
fruit, we must have a clear focus on the higher order concepts of energy mobilization and channeling in
the individual, rather than focusing on the localization and reduction of this response to metabolic
activities in particular muscles or tissues.

Welford, who views both mental and neuromuscular fatigue as effects of loading, would agree that
fatigue is a consequence or concomitant of workload. Bartley would also agree with this relationship
while insisting that fatigue is a condition of the individual and is not to be defined in terms of
external situations or even work products. In this situation, he considers energy expenditure, paced
performance, prolonged activity, and demands upon particular body mechanisms to be typically
fatigue-producing.

The primary difficulty in applying fatigue assessment techniques more explicitly is the
multidimensional character of fatigue phenomena and their interaction with even more complex phenomena of
individual motivation and stress tolerance. The approach of Bartlett to fatigue assessment, which applies
the concept of skill fatigue, is important since observable changes in pilot behavior
during primary task performance can be clearly and directly related to the accomplishment of flight
management and/or aircraft control objectives. It must be concluded that factors other than task
demands or protracted effort are more significant in the occurrence of fatigue. These other factors
include individual differences in personality, motivation, physical fitness, and life style, as well
as such situational factors as operational management policies, disruption of established biorhythms,
sleep patterns, and exposure to various environmental stressors. The relative contribution of personal
versus task-specific fatigue factors is an important unresolved issue.

To paraphrase the Webster Dictionary definition of stress, we find that stress is a physical,
chemical, or emotional factor to which an individual fails to make a satisfactory adaptation and which
causes physiologic tensions that may be a contributory cause of disease. While this publication is not
the format for a clinical discussion of the psychological, psychiatric, and biological aspects of the
effects of stress on the airman and its related effects upon the man-machine interface, the concept of
stress and its decrementing role is an important one in modern aerospace flight.

In his discussion of psychology and flying fatigue, Hartman (1) defines acute fatigue as that which
occurs in a single flight, during a single day, or more appropriately, between a pair of sleep periods.
Here the recovery from acute fatigue is a function of the adequate amount of rest available. But even
without prolonged or even relatively short rest periods, the fatigued flier can mobilize his resources
and return briefly to near rested levels of efficiency when the occasion demands. Hartman also defines
cumulative fatigue as that which occurs over a period of days or weeks, and is the result of inadequate
recovery from successive periods of acute fatigue. Recovery from cumulative fatigue is also dependent
upon adequate rest. However, without an adequate recovery schedule, the pilot finds himself fighting an
enhanced workload self-generated by his own loss in airmanship and efficiency, and finds that the longer
cumulative fatigue continues to build up, the longer it will take for him to recover his reserve and his
capacity to mobilize himself to meet high demand situations.

The term "chronic fatigue" has a special psychiatric meaning and is defined as a neuropsychiatric
disease. Chronic fatigue is a psychoneurotic symptom characterized primarily by difficulty in committing
oneself to an active or aggressive course of action and by a generalized withdrawal or retreat from a
conflict which is intolerable for situational or personality reasons. Thus, this entity is rarely seen
in either the military or civilian pilot/aircrew member.

Like fatigue, stress has its acute phases, one of which is an alerting arousal response enabling
the person to perform better and to otherwise adapt himself to an emergency. Cumulative stress, on the
other hand, is a build-up of physiological, chemical and emotional factors over a period of time until
some kind of maladaptation occurs. As Selye (2) points out, stress is a reasonably normal component of
modern everyday life and can be adaptive, but cumulative stress becomes maladaptive and ultimately, then,
stress becomes distress.

Selye (3) has also discussed his general adaptation syndrome, which he conceptualizes as the
defensive response of the body, through the endocrine system, to systemic injury evoked by stress. This
is worked out by an initial stage of shock, like an arousal or surprise reaction, followed by a stage of
growing resistance to the injury (adaptation), followed in turn by a final stage of healing or exhaustion
and death if adaptation fails. Note that there is no alternative course of action; one must either
resist the bodily effects of stress by healing or one must become exhausted, and ultimately be defeated
by the effects of stress. In short, then, one cannot ignore the effects of cumulative stress.

Sparks (4), in a chapter entitled "The Clinical Aspects of Psychiatric Illness in Fliers," brings to
light a relatively interesting aspect of a pilot's career progress. One could call this situation one
of selective screening, because flying personnel, through the almost automatic process by which they
learn the skills required for modern pilotage, be it military or civil aviation, are screened for emotional
stability. During their initial period of training, they have close supervision, with exposure to moderate
levels of stress together with the requirements of rigid discipline demanded by attention to procedures
and the awareness that the aerospace environment is an unforgiving mistress. In order to successfully
pursue his chosen career, the pilot must adapt to the early stresses of flight training. Flight itself
poses additional stressors to which he must form adaptive methodologies or strategies which act as a
further screening process. Following the completion of flying training, additional periods of flying
duties, upgrading and so forth, further cause him to adopt more and more strategies for coping, to the
point that almost any unstable individual would be self-eliminated prior to any operational assignment.

For the military pilot, combat poses additional and unique stressors to which he must adapt, ground
himself, or be ultimately surrendered. It is a small wonder that we expect to find any psychiatric
casualties once this screening process is completed.

However, the very aspect of recognizing this screening process implies that we also recognize that
we have a highly selected individual who will almost invariably stand up to most of life's stresses.
Therefore, we are inclined to form an almost mythical concept of the pilot as being inviolate and
unaffected by ordinary and inordinate stressors. This, of course, is far from true. Carlos Perry, in a
chapter on aerospace psychiatry (5), examines some of the stressors of aerospace operation. He points out
that "potential danger, physical discomfort, energy demands, attitude, and enforced physical passivity
have been well recognized as stresses and need little further elaboration." He also shows that increased
specialisation in aerospace operations is the source of stressors that were not apparent when flying
activities were more generally uniform. Thus, a given airman may well tolerate stresses associated with
flying long conventional type cargo hauls in the company of a crew, but not be able to successfully cope
with stresses of solitary, short duration, high altitude intercept flights in high performance single
engine jetcraft. He points out the incongruity involved in the military concept of alert. Here, we have
aircrewmen who are interested in flying, going places, and seeing things, and we force them to sit for
long periods of time in an alert facility, away from family and other satisfying activities. Perry points
out that boredom has also become a major stress factor. Here automation, lack of diversification, endless
routines, and increasing length of individual flights contribute to the production of boredom. He states
that "while boredom may be considered to be a benign type of stress, that these feelings are not far
removed from the more serious feelings of lack of ambition, futility, or even depression." He points out
that the nature of aerospace operations is the source of another major category of stress, with the
necessities of long and frequent travel and varying periods of absence, which can be a source of severe
stress to marital and parental activities.

Even the complexities posed by the various types of aircrew equipment for providing a livable
environment for man impose their own constraints and physiologic stresses. Mission and operational
requirements present the modern pilot and crew with ever-changing complex tasks which provide another form
of stress. These major sources of aircrew stress are compounded by the individual's internal
psychophysiologic reaction to stress and to general external stressors, such as personal, career or family
problems.

The human body is known to adapt to or to withstand severe conditions of immediate or acute stress.
However, it does not respond as well to long-term or cumulative stress, whatever its source. While
aircrew personnel are subject to special forms of stress, the basic reaction to stress is uniform. All
stress, physical, emotional and so forth, is responded to by some kind of an adaptive or avoidance
reaction. The basic need is to protect oneself from more and more stress. Physically the initial
response is a musculoskeletal tension/arousal response, which together with the corresponding changes in
glandular, organ, and nervous systems prepares the individual to retreat from the situation or to confront
it, the classic fight or flight reaction.

In the case of cumulative stress, the musculoskeletal and organ systems of the body tend to be
continually activated to the point where the individual is now stressed even when the original source
of stress is absent. The stress becomes internalised and most stimuli, either internal or external,
become sources of stress. We now have a chronically tense, irritable, agitated, disturbed person.

As cumulative stress continues, the musculoskeletal and organ systems of the body may start to
undergo pathologic changes. We begin to see psychosomatic symptoms of stress in the muscles of the neck
and shoulders or other parts of the body. Chronic muscle tension produces decreased blood flow in these
tissues, with pain and joint pathology. Chronic organ reactions yield typical symptoms of the
gastrointestinal tract, such as stomach cramps, ulcer, colitis, and so forth.

It is obvious that chronic stress and its related pathology cannot be ignored. The aircrew member
who experiences cumulative stress from one or multiple sources and who is then further subjected to
additional increments of stress from personal or aircraft equipment, from the demands of the mission, or
from fatigue or external stressors will find his best skills and efforts decremented. This degrading
of performance is obviously related to the disaster or near-disaster of the aircraft accident, but not
necessarily in the causal sense. A recent survey of USAF accidents fails to support degraded performance
or stress as a causative agent. Instead, stress and decremented performance are seen as factors which
are contributory in that they act to "set the stage", preparing the psychologic and physiologic world
of the pilot in such a manner that he is not able to respond effectively to one or more additional untoward
events. This is the insidious danger of stress pathology and constitutes an excellent reason for including
a pre-accident mental status investigation as part of the accident review process. However, as in most
things, "an ounce of prevention is worth a pound of cure." How can we prevent stress, or better yet, how
can we prevent distress as the result of cumulative stress?

There is a solution for cumulative stress. It has long been known, in a simple-minded way, that one
cannot be tense and relaxed at the same time. Thus, the adaptive response to stress/tension is one of
relaxation. Adequate recovery times from periods of cumulative stress with provision for recreation are
important. Perhaps even more important, however, would be a conditioned learning program wherein the
individual is taught to avoid the effects of cumulative stress by keeping himself in a relatively relaxed
state. There are several long-standing approaches to this type of training, one of the earliest being that
of Schultz with his autogenic training, followed by Jacobson's progressive relaxation training and, more
recently, by such meditative techniques as transcendental meditation (TM). An intriguing modern day addition
to these forms of relaxation methodologies is that of biofeedback techniques, where an electronically
generated signal from the muscle or organ system involved is available to the individual as a learning
technique. This is based on an axiom in information theory that states "the controller given information
about the state of the system can then exercise control over that system." It has been demonstrated that
the central nervous system can exercise exquisite control over the CNS, the spinothalamic and autonomic
nervous systems. Biofeedback is merely one way of giving the controller information about the state
of the system so that he can learn to exercise the necessary control. While biofeedback simply utilizes
modern electronic technology, we must realize that there are many adaptive strategies which could be
employed and also realize that human beings come supplied with internal biofeedback signals which
undoubtedly play an important role in both adaptive and maladaptive behaviors.

If the nature and extent of the various stressing and fatiguing factors that act
on the body and psyche of aircraft pilots during their specific activity are analyzed, it appears obvious that
underlying the exercise of a pilot's profession is a basic situation which ultimately permeates the whole of
his activity and exerts a multiplicity of effects on the physique and psyche of the pilot.

The fundamental characteristics of such a situation may be summarized as follows:

1. Flying involves the use of a machine that is required, unlike other machines, to respect certain
aerodynamic laws. Any infraction of these laws involves an immediate risk of crash and accident. In the
pilot's professional activity, therefore, life depends on the machine and its continuous efficiency, a
situation that in actual practice expresses itself in the form of a permanent image of potential
"vulnerability" undoubtedly present in the subconscious of each and every pilot.


2. A pilot's activity depends a great deal on the spatial environment of the aircraft, the
three-dimensional displacement and rapid transition of the aircraft, and, indirectly, on the various conditions
that have repercussions on the human body (i.e., accelerations, acoustic and non-acoustic vibrations,
equipment, sensorial stress, etc.). All these conditions constitute links in a chain of factors which
readily explain the wealth of interferences that act on the somato-psychic balance and, consequently, on
performance, adaptability and, in the long run, on individual fatigue.

3. Flying does not just represent a technical or operative activity, i.e., a job, but rather "a
vital activity and an 'in toto' reaction of the ego to the environment."

Upon such a basic substrate, which is in itself potentially stressing and qualitatively common to all
pilots irrespective of their specialization and the type of aircraft they fly, there then act interferences
due to the various physical and psychic factors, each of which plays a specific and individualizing role
both in connection with general and particular aircraft (fighter, transport, reconnaissance, rescue, or
helicopter, etc.).

It would certainly be interesting and important if it were possible to define the degree and limits
of such psychophysical workload by means of technically valid and acceptable scientific methods with a
view to obtaining differential qualitative and quantitative assessments of the various flying
specializations. In fact, numerous methods have been proposed periodically for obtaining a measure of workload
by quantitatively evaluating the functional changes that fatigue can produce. As is known, such changes may
consist of an increase in the duration and inconstancy of the psychomotor reaction times; an increase
in the latency time of the pupillary reflex; a diminution of the capacity for rapid binocular fusion; an
increase in the accommodation time for near and distant vision; a reduction of the critical flicker fusion
frequency (3, 12) and changes of other ophthalmic indices; modifications of the characters and duration of
the monosynaptic spinal reflexes produced, for example, in the area of the sciatic nerve (1); variations
in the duration of the central nervous time of the orbicular blinking reflex under light stimulation, and
the time needed for a complex mental process (11); reduction in muscular force and muscular tone; increased
instability in neuromuscular coordination; increased loss of electrolytes through cutaneous sweating;
reduced circulating plasma volume; variations in the urinary excretion of corticosteroids (7) and
catecholamines (2, 6); variations in the lactacidemia, glycemia, and cholesterolemia values, the ratio between
alpha and beta lipoproteins, the number of the eosinophils, and the hematocrit index; and, finally,
electrocardiographic changes and variations in the Ruffier and Dickson index of cardiac resistance (4).

Quite obviously, however, all these methods lend themselves very readily to criticism. Indeed, none
of the results obtained by these methods is capable of being interpreted in a unique manner. In fact,
these methods measure functional changes that are or can be influenced considerably by a wealth of other
factors, both endogenous and exogenous, including, first and foremost, the subject's age. Therefore, if
one wanted to make a comparative evaluation of the amount and precocity of the stress and the
psychophysical workload produced by the individual stressing factors connected with flying, one would admit that
it is extremely difficult to find a precise differential criterion that could be used to obtain a
quantitative gradation of this workload. This is not only because the subjective element, here understood as
the individuality and extreme variability of the response of the single subject to every type of stimulus,
has a predominant weight in this particular activity; it also depends on the nature and extent of the
reaction to any kind of stimulus, which are, in turn, conditioned by numerous and extremely variable
individual, environmental, and circumstantial factors.


After these necessary premises concerning the difficulties of a unique interpretation of all
proposed diagnostic methods and the preponderance of psychic over physical workload in piloting
aircraft, it is our opinion that, among the above-mentioned functional changes eventually produced by
emotional and psychic fatigue, particular attention could be reserved in aviation medicine for
variations in the urinary excretion of corticosteroids and especially catecholamines.

It is well known that every stress, whether physiological or emotional, is capable of inducing
organic reactions due to the increase of corticosteroids and catecholamines in the blood circulation.
According to many authors who have studied the phenomenon in the aviation field, such increases also occur
in particular flight conditions, especially those likely to set up a state of stress.

Therefore, it is possible to conclude that the determination of the urinary excretion of
catecholamines in particular, as an indication of a possible psychic stress, could be used as a method to objectify
"emotions." This could have a useful application in practice to reveal emotional states undergone in
flight, particularly during the phase of training and all other conditions of considerable psychical
engagement in the course of aeronavigation.

In other words, the determination of such substances would then give useful information about the
presence of stress and would also allow evaluation of the intensity of the latter (and of the consequent
workload). The same evaluation may also be obtained by determining the quantity of vanilmandelic acid (VMA)
excreted with urine, such an acid taking its origin from the metabolism of catecholamines (5).

These methods might be usefully and practically employed with the purpose of obtaining an objective
measurement of the emotional aspects of the human personality in real conditions, and then quantitatively
evaluating the workload (especially psychic, but also physical and physiological workload) in the pilot's
professional activity.

REFERENCES 

1. Gualtierotti, T., R. Margaria, and D. Spinelli. 1958. Effects of stress on lower neuron activity.
Exper. Med. Surg. 16:166.

2. Klepping, J., O. Buisson, J. Guerrin, A. Escousse, and J. P. Didier. 1963. Evaluation de
l'elimination urinaire des catecholamines chez des pilotes d'avions a reaction. C.R. Soc. Biol.
157:1727.

3. Krugman, H. E. 1947. CFF as a function of anxiety reaction: an exploratory study. Psychosom.
Med. 9:269.

4. Le Roux, R. 1960. La fatigue operationelle des pilotes d'helicopteres. Revue des Corps de Sante
1:493.

5. Paolucci, G., and G. Blundo. 1973. Determination of emotional condition in student pilots during
air-navigation by dosing vanilmandelic acid (VMA) excreted with urine. Riv. Med. Aer. Spaz.
36:184.

6. Paolucci, G., and G. Blundo. 1975. Catecholaminic excretion in student pilots. Riv. Med. Aer.
Spaz. 38:27.

7. Rotondo, G. 1955. On the treatment of pilots affected by operational fatigue with
dehydroisoandrosterone. Riv. Med. Aer. 18:78.

8. Rotondo, G., and A. M. De Angelis. 1966. Acetil-aspartic acid and citrulline in treatment and
prevention of flight fatigue. Riv. Med. Aer. Spaz. 29:85.

9. Rotondo, G. 1969. Experimental contribution to preventive and therapeutic treatment of flight
fatigue. Riv. Med. Aer. Spaz. 32:231.

10. Rotondo, G. 1977. Workload and operational fatigue in helicopter pilots. Aviat. Space Environmental
Med. (In print).

11. Spinelli, D., and F. Cerretelli. 1961. Analysis of central nervous functions in particular
physiological conditions. Med. Sport. 1:128.

12. Vozza, R. 1955. Flicker fusion frequency as a test of operational fatigue in jet pilots. Riv. Med.
Aer. 18:771.



PHYSIOLOGIC ASPECTS OF WORKLOAD/FATIGUE/STRESS*

by

Layne P. Perelli, Captain, USAF
Crew Technology Division
USAF School of Aerospace Medicine (AFSC)
Brooks Air Force Base, Texas 78235
USA

It is important to recognize that the physiological mechanisms of the organism do not particularly
care, nor are they necessarily aware, that they are reacting to the effects of workload, the effects of
fatigue or the effects of stress. Physiological mechanisms provide a common link between the concepts
of workload, fatigue and stress. Traditionally the basic physiological approach to fatigue involves
the  measurement  of  energy  expended  in  performing  a  given  amount  of  work.  As  early  as  191!#  to  1920, 

Waller  and  De  Decker  (1)  measured  the  carbon  dioxide  production  of  workers  and  were  able  to  relate 
Increases  in  carbon  dioxide  production  to  a  reduction  of  work  output  during  a  night's  activity.  They 
use  the  term  "physiological  cost"  to  describe  the  Increased  metabolic  demands  resulting  from  Increased 
fatigue  and  related  lowered  performance.  Page  (2)  (3)  has  suggested  that  the  concept  of  fatigue  be 
replaced  with  the  concept  of  metabolic  cost,  and  Bittenrin  (4)  has  suggested  that  the  concept  of  fatigue 
be  defined  as  a  reduced  efficiency  resulting  from  continued  work  and  reversible  by  rest,  with  efficiency 
defined  as  the  ratio  of  performance  to  expended  effort.  Effort  was  to  be  determined  from  metabolic  cost 
indices. 

Concepts of physiologic cost are related to Selye's concept of the general adaptation syndrome, in
which any stress to which the body is exposed creates an overall non-specific, systemic reaction to cope
with or reduce the stress (5). It is theorized that fatigue creates a stressful condition to which the
body tries to adapt, and in so doing produces an abnormal set of physiologic indicators which can be
evaluated as to the severity of the fatigue/stressor.

After reviewing several fatigue studies showing no significant or dramatic performance decrement
and one study with a performance increase, Cameron (6) concludes that performance measures are too erratic
and unreliable to serve as indicators of fatigue. He feels that the term fatigue should be used as no
more than a descriptive term for a generalized stress response over a period of time, and that the best
index of acute and chronic effects would be the time required for biologic emergency mechanisms to return
to a normal arousal level.

Pursuing this same line of thinking, Harris and O'Hanlon (7) provide a review of what is known about
the recovery of man from exposure to certain adverse conditions, such as sleep deprivation, abnormal
work/rest cycles, prolonged physical work, and environmental and situational stressors. Their purpose
was to determine whether recovery functions can predict how long a man can maintain effective performance
before he must be relieved, and how long a rest period is required before he is again ready to perform
effectively during continuous military operations. They conclude that while there is insufficient
knowledge now available to make such predictions, the following list of potential physiological failures
seems most important to consider, and reversal of these impairments may provide practical indications
that recovery has taken place: 1) Degraded physical working capacity, 2) Inadequate iron reclamation,
3) Mild cardiac fatigue, 4) Paroxysmal cerebral cortical activity, 5) Impaired carbohydrate metabolism,
6) Thiamine deficiency, 7) Involuntary hypohydration, 8) Glycogen exhaustion, 9) Increased susceptibility
to infection, 10) Imbalanced protein metabolism, and 11) Adrenal cortical and medullary exhaustion. Like
Cameron, they feel that changes due to fatigue will become apparent in the physiological systems before
performance degradation occurs. This implies, of course, that even though a given schedule of work has
not yet produced performance decrement, work-rest cycles should be structured so that severe changes in
the physiological systems are prevented.

Following this same concept, that the physiologic cost of fatigue is generally not an immediate
problem provided the individual receives sufficient recovery time, Hartman and Cantrell (8) have taken
the position that the best approach to maintaining man's capacity for skillful work is to engineer the
system so that physiological degradation is eliminated. This implies that if physiological indicators
known to be associated with stress reactions are found to be within normal limits, then it is presumed
that no performance decrement of operational consequence has occurred. Thus, the problem is to quantify
these physiologic limits in relation to a criterion of performance degradation in such a way as to cause
system managers to design, man, and use an operational system so that these limits are not
exceeded. One of the difficulties in using physiological indicators for evaluating workload, fatigue,
or stress is of a temporal nature, in that some physiological responses can be observed only after periods
of hours or even days while other responses occur almost instantaneously. Some measures are unobtrusive
and could be used in operational situations, while others are somewhat impractical or often impossible
to obtain.

We will first discuss the long term physiological indicators of stress, workload, and fatigue recovered
from the organism and measured as urinary metabolites, namely the 17-hydroxycorticosteroids (17-OHCS)
and the catecholamines (epinephrine and norepinephrine). The following general review of
17-hydroxycorticosteroids and catecholamines is taken from Guyton (9).

* This material was abstracted from a chapter of Captain Perelli's draft doctoral dissertation by
Richard E. McKenzie, Ph.D.

Steroids, principally cortisol, are secreted into the blood stream from the adrenal cortex in response
to a wide variety of stresses. Cortisol enables the body to cope with stress through its effects on
carbohydrate, fat, and protein metabolism. It stimulates gluconeogenesis by the liver and
decreases glucose utilization by the cells, which in turn raises the blood glucose concentration. At
the same time it causes a reduction in protein stores in all parts of the body except the liver. Blood
amino acid concentration goes up, transport of amino acids into extrahepatic cells is diminished, and
transport of amino acids to the liver is enhanced. Amino acids are thus mobilized from the tissues to
the liver. Finally, fatty acids are brought out of adipose tissue, increasing their blood concentration,
which increases their utilization for energy. The adrenal cortex secretes steroids in response to
adrenocorticotrophic hormone from the adenohypophysis, which is under direct control of the hypothalamus.
With this indirect feedback mechanism, levels of cortisol can continue to rise to very high blood
concentrations as long as the stress agent continues to stimulate the hypothalamus in some way. Cortisol
fixes to its target tissues in about 20 minutes after release. The normal blood concentration is about
12 micrograms per 100 milliliters and its half life in the blood is 100 minutes. The normal secretory
rate is 15 milligrams per day, of which approximately 75% is excreted in the urine.
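
As a rough illustration of the half-life figure quoted above, the following Python sketch assumes
simple first-order (exponential) decay with no further secretion. That assumption is a simplification
made here for the arithmetic and is not stated in the source; the function name
remaining_concentration and the sample times are hypothetical.

```python
def remaining_concentration(initial, minutes, half_life=100.0):
    """Concentration left after `minutes`, assuming pure first-order decay.

    `initial` is in micrograms per 100 milliliters; `half_life` is in minutes.
    Ongoing secretion is ignored, which is an assumption of this sketch.
    """
    return initial * 0.5 ** (minutes / half_life)


if __name__ == "__main__":
    start = 12.0  # normal blood concentration quoted in the text
    for t in (0, 100, 200, 300):
        level = remaining_concentration(start, t)
        print(f"{t:3d} min: {level:.2f} micrograms per 100 ml")
```

Under these assumptions the level halves every 100 minutes, which helps explain why a single sample
can reflect a stressor that ended well before collection.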

At this point it should be obvious that one can measure 17-ketosteroid production from either
blood or urine sampling. The only problem one should be aware of is that the peak
concentration of 17-OHCS in urine lags that in blood plasma by about two hours. Increases
in 17-OHCS excretion have been found for various anxiety producing situations, such as electroshock
treatment, the administration of hallucinogenic drugs, and the viewing of mildly stressful
motion pictures. Berkun, Bialek, Kern and Yagi (10) performed an extensive series of experiments
simulating five stressful military situations in which the subject was led to believe that he was in
immediate danger of losing his life or of being seriously injured, or that by his actions he had seriously
injured one of his colleagues. All of these stress situations resulted in elevated 17-OHCS excretion,
and the level of the increase was related to the presumed level of stress induced for each situation.

Miller (11) provides a review of the many studies in which 17-ketosteroids have been found to
increase due to the stress of military flying. In 1943, Pincus and Hoagland (12) conducted three sets of
experiments which related steroid excretion and flying stress. They reported not only significantly
increased steroid production, but also that individual performance scores were positively related to the
level of steroid increase. They also reported that increases in steroid production were related to
independent ratings given by the pilots' squadron commander of their individual susceptibility to fatigue.

Catecholamines: Catecholamines are secreted by the adrenal medulla in response to stimulation from the
sympathetic nervous system. The relationship between the adrenal medulla and a threatening situation
was first demonstrated by Cannon and de la Paz (13). While the proportions of catecholamines which are
secreted depend upon the physiologic conditions, on the average 75% epinephrine and 25% norepinephrine
are secreted. Their effects on the body are the same as those caused by direct stimulation of the
sympathetic nervous system, but the effects last about ten times longer since the circulating
catecholamines are only slowly removed from the blood. It should be noted that the sympathetic nerve endings
secrete norepinephrine, but in a matter of seconds it is reabsorbed or destroyed at the cellular level
by O-methyl transferase or monoamine oxidase. These enzymes are similar to cholinesterase, which destroys
acetylcholine, the agent secreted by the parasympathetic nervous system. While both the sympathetic
nervous system and the secretions of the adrenal medulla have general nonspecific effects, the
catecholamines stimulate and increase the metabolic rate of every cell in the body. However, it must be
noted that circulating catecholamines do not readily pass the blood-brain barrier. This means that
central nervous system physiology is not as reactive to these circulating substances as is the rest of
the body physiology.

The general result of stimulation of the sympathetic nervous system is to mobilize the body for
action. Norepinephrine causes general vasoconstriction, increased cardiac activity, increased basal
metabolism, sweating, inhibition of the gastrointestinal tract, glucose release from the liver, decreased
kidney output, and adrenocortical secretion. Epinephrine has similar effects but has a greater
stimulating effect on cardiac activity and basal metabolism and has a less constricting effect on the vascular
system of the skeletal muscular system. Normal resting secretion rates are [?] micrograms per kilogram
of body weight per minute for epinephrine and .07 micrograms per kilogram of body weight per minute for
norepinephrine.

While there is some indication that catecholamines are excreted due to stress, they are generally
released in relation to the overall activity level or performance level. In a review of catecholamine
response to various activities, Euler (14) reports that mental stress associated with anger, aggression,
or exhilaration will increase norepinephrine excretion, while emotional states characterized by
apprehension, discomfort, or painful or unpleasant feelings will increase epinephrine excretion. As an example
of what one may expect to find in measures of catecholamine levels, Euler and Lundberg (15) found that
urinary epinephrine levels were elevated in pilots as well as inexperienced passengers during one or one
and one-half hours of moderately stressful flights. The pilots also had elevated norepinephrine levels
while the passengers did not. Melton and Fiorica (16) found that both epinephrine and norepinephrine
excretions were elevated during cross-country flights in private pilots with less than 100 hours flying
experience. However, the levels of excretion were not related to the length of flying time. A more
recent study by Krahenbuhl, Marett and King (17) explored catecholamine production during various phases
of Air Force flying training in the T-37 jet aircraft. They found that the emergency procedure phase,
which was given in a Link trainer, was essentially nonstressful, but that both epinephrine and
norepinephrine were significantly elevated from control values during actual spin, solo, and check flights. Here
again, the assessment using epinephrine appears to be more responsive than the use of norepinephrine as
an indicator.

Even though there does not appear to be any functional relationship between the adrenal medulla and
the adrenal cortex, there is an interaction of catecholamine and steroid effects within the body.
Broverman, Kleiber and Vogel (18) have attempted to differentiate the effects of short-term versus
long-term stress relative to the interaction of catecholamines and steroids. Short-term stress is
hypothesized to facilitate performance on serially repetitive, overlearned tasks and to impair performance
on novel tasks requiring perceptual restructuring. Long-term stress is hypothesized to have the opposite
effects. They attempt to account for these findings by arguing that during short-term stress behavior
is dominated by the sympathetic nervous system. However, with increasing exposure of the central nervous
system to the stress-elicited adrenal hormones, dominance shifts to the parasympathetic system, causing
an overall depression of activity.

The Cardiac Indicators: The cardiac activity indicators heart rate (HR) and heart rate variability (HRV)
have been used extensively to analyze inflight pilot activity, probably because the data can be collected
without gross interference with flight activities. In addition, heart rate can be measured for specific
segments of performance during relatively short time spans. There is no way to precisely determine
the relative contributions of any segment of behavior during a urine collection period, and thus urine
analysis is confined to relatively gross estimates of when performance decrement has occurred. In
addition, heart rate and heart rate variance appear to be more closely related to activity levels and
performance quality than is the information on catecholamine production revealed by urine analysis. One
other advantage of heart rate activity is that data reduction can be performed almost immediately and
easily, while urine analysis requires one or two days of chemical analysis in the laboratory under
fairly optimal conditions. A study by Bateman et al. (19) shows that heart rates for commercial
pilots on routine flights, upgrade training flights, and simulator flights are very similar and higher
than resting rates. However, heart rates on basic training flights were found to be significantly higher. Heart
rate increased in response to specific inflight stresses and when pilots were demonstrating maneuvers requiring
a high degree of skill. Opmeer and Krol (20) found that increases in heart rate and decreases in heart
rate variance matched the predicted order of increasing difficulty of four phases of flight, namely
baseline, level flight, take-off, and approach. When pilots were required to fly realistic flight plans
in a simulator, the same relative changes were found. They found heart rate variance to be a more
sensitive measure than heart rate alone, and they concluded that heart rate variance appeared to be more
related to cognitive tasks whereas heart rate was more responsive to anxiety inducing tasks.

Roscoe (21) has demonstrated that heart rate is a useful tool in evaluating pilot workload changes
created by new aircraft instrumentation and advanced control systems. Heart rate was found to vary as
changes in weather conditions and different runways created more stressful landings. While these inflight
cardiac indicators have yielded some information on cognitive workload and stress levels experienced by
pilots, laboratory studies in which the stimulus presentations can be more precisely controlled have
been much more successful in relating these indices to performance and workload.

The normal resting heart rate exhibits a relatively large degree of beat to beat irregularity (HRV)
referred to as sinus arrhythmia. Ettema and Zielhuis (22) found that sinus arrhythmia was significantly
depressed and heart rate, blood pressure, and respiration rate were significantly increased as workload
increased. They concluded that this effect is due to a change in both the breathing pattern and a rise
in vagal tone and sympathetic nervous activity induced by the mental load. Boyce (23) found essentially
the same increase in heart rate and decrease in HRV for increasing mental loads. A series of studies
by Thackray (24) has shown HRV to be a useful measure for separating rest periods from mental work periods
on a variety of tasks. Using a two dimensional compensatory pursuit tracking task, he found that heart
rate variance, along with heart rate, blink rate, respiration rate, respiration period variability, and
skin conductance, were all capable of differentiating the rest period from the work periods. In a
simulated radar control task, heart rate variance was found to be higher for subjects reporting high
boredom. In addition, the performance of the subjects in the higher boredom group also significantly
declined over the rest period. This would suggest that HRV reflects a level of attentiveness which is
related to overall performance capability.
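
The two cardiac indicators discussed above can be computed from a list of beat-to-beat intervals. The
Python sketch below is only one plausible way to do so: it takes heart rate variance to be the
population variance of the instantaneous beat-to-beat rate, which is an assumption, since the studies
cited may have used other definitions. The function name heart_rate_summary and the interval values are
hypothetical illustration data, not measurements from the source.

```python
from statistics import mean, pvariance


def heart_rate_summary(rr_ms):
    """Return (mean heart rate, heart rate variance) from beat-to-beat intervals.

    `rr_ms` holds successive intervals between beats in milliseconds. Each
    interval is converted to an instantaneous rate in beats per minute.
    """
    rates = [60000.0 / rr for rr in rr_ms]
    return mean(rates), pvariance(rates)


if __name__ == "__main__":
    # Hypothetical intervals: irregular at rest, fast and regular under load.
    rest = [900, 1000, 820, 1050, 880]
    load = [640, 650, 645, 655, 648]
    for label, intervals in (("rest", rest), ("load", load)):
        hr, hrv = heart_rate_summary(intervals)
        print(f"{label}: HR {hr:.1f} beats/min, variance {hrv:.2f}")
```

With these made-up numbers the loaded condition shows the pattern described in the text, a higher
heart rate together with a lower variance.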

A fairly comprehensive view of the relationship between cardiac indicators and performance has been
stated in the broader framework of arousal theory. It is known that the level of performance quality is
related to the degree of arousal or activation level of the operator in terms of an inverted, U-shaped
function, which implies that an optimal level of activation will produce maximum performance capability.
This in turn is related to the reticular activating system, which in effect mediates the sleep/wakefulness
dimension. This of course is related to increasing levels of fatigue. Heart rate can be expected to
decrease as the subject's level of arousal falls or to increase as extra effort is put forth to stay
awake. The seemingly paradoxical increase in heart rate with fatigue is normally seen with physical
exertion as well, where heart rate continues to increase under vigorous exercise up to the point of the
collapse of the organism. Thus, the task demands of the systems operator job must be taken into account
if one is to predict the arousal level of a long duration flight. Corcoran (25) attempted to separate
the concept of arousal from task demand by requiring minimal activity from subjects during a 60-hour
period without sleep. In this case, both heart rate and performance on an unarousing, nonphysical,
30-minute vigilance task fell consistently. He argues that performance will follow the inverted "U"
previously described with decreasing arousal, and that arousal will fall with lack of sleep or increased
fatigue, but the effort to remain awake, which is what is being measured by physiological indicators, will
be a function of task demand and subjective motivation to remain awake.

Extrapolating from these research findings, the following changes in heart rate and heart rate
variance can be predicted for long duration flights. First, heart rate and heart rate variance would
tend to increase with moderate levels of fatigue. With very high levels of fatigue, heart rate would be
expected to fall and heart rate variance to increase still further. We would also find that tasks which
created greater levels of arousal because of their complexity or the amount of concentration required
would be initially more resistant to fatigue effects. From this we can hypothesize that straight and
level periods of flight requiring minimal control input and instrument monitoring should show greater
performance decrement with fatigue than periods when maneuvers must be performed. Heart rate should be
higher and heart rate variance should be lower as the arousal value of the task increases. Tasks
requiring maximum levels of information and concentration should show the least performance decrement, the
greatest heart rate increases, and the greatest heart rate variance decreases.

Thus, we appear to be at a point where the pilotage aspects of information processing,
decision making, pattern recognition, and so forth are the important task variables, and cardiac indicators
are among the important measures of workload, fatigue, and stress relative to the man-machine system.
However, it would be a mistake to focus upon single physiologic variables. We have the present capability
to collect and evaluate multiple physiologic variables and weight them by means of regression analysis so
as to investigate whether meaningful physiologic profiles can identify specific reactions to specific
aspects of workload, fatigue, or stress.
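
The idea of weighting several physiologic variables by regression can be sketched as follows. This is
a minimal ordinary least squares fit written with the Python standard library only; it is not the
analysis used by the author. The function name fit_weights, the choice of predictors (heart rate and
heart rate variance), and the workload ratings are all hypothetical and were constructed to follow an
exact linear rule so the recovered weights are easy to check.

```python
def fit_weights(rows, targets):
    """Ordinary least squares; returns [intercept, w1, w2, ...].

    `rows` is a list of predictor tuples, `targets` the matching outcomes.
    """
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    n = len(X[0])
    # Normal equations (X^T X) w = X^T y, stored as an augmented matrix.
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


if __name__ == "__main__":
    # Hypothetical (heart rate, heart rate variance) pairs and workload ratings
    # generated from: rating = -2 + 0.1 * HR - 0.05 * HRV.
    rows = [(70, 40), (80, 30), (90, 25), (100, 15), (85, 35)]
    ratings = [3.0, 4.5, 5.75, 7.25, 4.75]
    intercept, w_hr, w_hrv = fit_weights(rows, ratings)
    print(f"intercept {intercept:.3f}, HR weight {w_hr:.3f}, HRV weight {w_hrv:.3f}")
```

In practice the predictors would be standardized and the fit checked against held-out data before any
profile was interpreted; the sketch only shows how the weights arise.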

While we are considering the present state of the art for both ground-based and inflight physiologic
measurements, some exciting breakthroughs are on the horizon which may allow us to measure and utilize
cortical indicators of dynamic brain activities, including decision making and information processing.

But first, we shall explore how we arrived at the position that information processing activities relative
to required information input are a vital consideration in the evaluation of the man-machine interface.

It is known that man-machine systems require certain kinds of operator skills and involve specific
kinds of tasks, whether they are ground based, airborne, or in space. Viewing the development of aviation
from its infancy through current operational aircraft, airborne weapons systems, and space systems, we
see a remarkably accelerated development of automation. With this development, there has been a shift
in the nature of the job performed by the man in this assembly of man and machine. In general, piloting
is really more like "machinemanship," with the number of subsystems which the pilot must control, the
number of cockpit displays and other informational inputs, as well as the increased communications load,
all contributing to a tremendous increase in workload (1). The term workload is a somewhat ambiguous
concept that can be defined in many ways. We feel that workload encompasses the concepts of performance,
fatigue, and stress, any one of which can be defined in terms of the others. Keeping in mind the pilot's
function as a systems monitor, wherein he initiates occasional commands to the system, we know that the
pilot will assume actual control of the system only at intervals against a background of activity at a
lower level. Thus, we have a highly variable work rate situation, and our initial concern is whether or
not the intervals of low activity might alter the efficiency of the operator when he is required to
assume command or exercise control over the system. In our first attack on this problem we used four
different workload levels from which the subjects went into a period of overload. The subjects in this
study were used in a single session, matched group design. There were a total of 20 subjects, five in
each of 4 load levels. We obtained pronounced decrements in performance during overload after successively
lower work load levels. Unfortunately, in spite of the matching, there were some differences between
groups of subjects on initial or baseline proficiency; therefore, we felt that this initial exploratory
study was not an adequate evaluation of the problem. What we needed was a repeated session design using
each subject as his own control. With this refinement in a follow-on experiment, we found no differences
in proficiency related to different base work rates. This confirmed British studies on speed, that is,
signal rate stress, wherein the effects obtained are a function only of the immediate operator load and
are independent of the characteristics of the preceding task load levels. So we found that the system
operator works at a steady, systematic rate independent of the more variable rate of signal onset. The
operator tends to ignore a rapid onset of signals, proceeding in a methodical fashion to work on each
subtask as he gets to it. We liken this smoothing function to the strategy of "queuing" proposed by
Miller. In this strategy the operator assigns each new input to a kind of conceptual list of responses
to be made when he gets to them. We looked for Miller's other adaptive strategies, which he called
"filtering" (ignoring some signals in order to process the remaining more effectively) and "two-handed
operation." Instances of filtering could not be identified and two-handed operation occurred only
infrequently; however, this does pose the question as to what conditions cause or promote the use of
such strategies (2).
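
The "queuing" strategy described above, in which the operator works at a steady rate while signals
arrive irregularly, can be illustrated with a small simulation. The Python sketch below is a toy model
written for this purpose and not a reconstruction of the experiment: the function name
simulate_queuing, the use of discrete time ticks, and the arrival pattern are all assumptions.

```python
from collections import deque


def simulate_queuing(arrivals_per_tick, service_per_tick=1):
    """Return the backlog after each tick for a steady-rate operator.

    New signals join the end of a conceptual list; the operator answers at
    most `service_per_tick` of them per tick, oldest first, regardless of
    how quickly they arrived.
    """
    queue = deque()
    backlog = []
    next_id = 0
    for arrivals in arrivals_per_tick:
        for _ in range(arrivals):
            queue.append(next_id)  # assign the new input to the list
            next_id += 1
        for _ in range(min(service_per_tick, len(queue))):
            queue.popleft()  # respond to the oldest waiting signal
        backlog.append(len(queue))
    return backlog


if __name__ == "__main__":
    # A burst of four signals in one tick, then a single later signal.
    print(simulate_queuing([0, 4, 0, 0, 0, 1, 0]))  # [0, 3, 2, 1, 0, 0, 0]
```

The burst is absorbed over the following ticks at the operator's own pace, which is the smoothing
effect the text attributes to queuing. Filtering would correspond to discarding some arrivals instead
of appending them.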

This initial study was reported in 1961, but in the meanwhile we became involved in evaluating system
operator performance factors in the School of Aerospace Medicine's space cabin simulator. In evaluating
the operator data, we reported the possible "energizing" effect that an initial high signal rate period had
on a subsequent period of very low signal rates. We also felt that signal rate might be a way of
manipulating both duty time and diurnal variables (3). With the idea that performance decrement was not
specifically time-anchored, but more of an immediate or instantaneous product related to signal rate, we
continued to gather data in the space cabin. The next series of flights explored a reversal of day/night
operating times. Here we again found that signal rate was a primary factor in performance, with marked
decrement at low signal rates below those of 119 per hour. This effect is attenuated by the day/night
cycle in that performance decrement is not as great when the low signal rate periods occur during the
day (4).

At this point the requirement for evaluating special mission personnel, including astronaut candidates,
led to the interesting concept of task induced stress. Here competing tasks were used in a manner so as
to cause the operator to psychologically internalize the task stress, rather than to attribute his obvious
performance decrement to the task itself. Aside from the problems of selection and evaluation, the
results of our attempt to induce this kind of stress show that the competing task situation produced
significant task stress which could be used to assess the relative adaptiveness of the individual. In
other words, the selected group was better able to perform and was, therefore, less susceptible to the
signal/noise ambiguity produced by the task and less bothered by the induced task stress (5).

A later overview of all of the SAM space cabin flights was aimed at evaluating information input as
a factor in crew performance. We might have called the study "signal rate revisited". In brief, we were
able to show that a constant, fairly high level of signal rate (500 signals per hour) resulted in a
rather remarkable increase in operator performance compared to low and/or variable work loads (6). Thus,
we have placed work/rest cycles, diurnal variations, etc., in the role of secondary or even more remote
factors in human performance, and we are left with information input as a critical variable. We can
infer that any factor affecting man's ability to process information, that is, fatigue, drugs, stress
(physical and psychologic), etc., will be reflected in decremented performance.

One easily recognized confounding factor in the work load/performance area is fatigue. Ordinary
fatigue has never shown up as a significant factor in the space cabin studies. However, operational
requirements often impose aircrew problems related to change of sleep cycles, early awakening, and so
forth, in the face of increased pilotage demands. At times, tactical demands have raised the question
of preflight and/or in-flight pharmacologic support. One such demand indicated the use of preflight
sleep induction and inflight arousal via medications. Two research studies were designed to test the
effects such medication might have on performance. Here, workload was an approximation of the tactical
mission. Performance was measured using the Multidimensional Pursuit Test developed by the USAF
School of Aerospace Medicine years ago as an aid to pilot selection. Preselection criteria and a
psychologic test battery were used as predictors. Heart rate and respiration were also monitored. In
this instance, physiologic monitoring and psychologic testing neither revealed nor predicted any systematic
changes related to the drug treatments.

The drug treatments involved the administration of secobarbital (three grains) with the "in flight"
administration of d-amphetamine (5 milligrams) with appropriate controls. The results indicated a
hangover effect of three grains of secobarbital seen at the start of the mission 10 hours later and
still present at the end of the mission 11 hours later. The effects of d-amphetamine are decreased in
individuals taking secobarbital (7).

A follow-on study using only 1½ grains of secobarbital showed no apparent psychomotor hangover (8).
While these kinds of in-laboratory studies are needed, the cost of doing more than approximating the task
structure and workload requirements is usually prohibitive. However, we feel that the increased necessity
to consider the use of pharmacologic agents by aircrew members will dictate more studies relating to
pilot performance. What the laboratory lacks is some comparative standard of laboratory task or task
system as it relates to acceptable performance standards for actual aircraft pilotage. In spite of the
long history of laboratory testing, we still cannot answer the question "Will this particular drug, or
instrument, or device impair or enhance the pilot's ability to perform his required duties?" Theoretically,
it should be possible to state with scientific confidence that performance on laboratory task "X" at a
certain level indicates that pilotage under the experimental conditions being evaluated would be difficult,
dangerous or impossible. We have not yet developed such criteria which can be applied to the airborne
human. However, we try!

Given the strong evidence of the critical nature of the relationship between information processing
ability and aircrew performance, perhaps we should make a dedicated effort to evaluate information
processing ability as our laboratory task "X" and compare these results with simulated aircraft pilotage
performance.

A classification scheme is presented which summarizes a survey and analysis of aircrew workload
assessment techniques relevant to inflight test and evaluation considerations. Two dimensions consisting
of universal operator behaviors and workload assessment methodologies were used in the classification
scheme. The universal operator behaviors were classified according to the Berliner, Angell, and Shearer
(1964) categories including perceptual, mediational, communication, and motor processes; whereas the
workload assessment methodologies were cataloged into 28 procedures under the general categories of subjective
opinion, spare mental capacity, primary task, and physiological measures. An applicability matrix based
on this classification scheme is presented which summarizes existing research on workload assessment
methodologies, and a bibliography of over 400 relevant references is provided as an appendix to this paper.
Procedures are described whereby this matrix can be used as a guide for selecting candidate aircrew
workload assessment measures for inflight evaluation. A brief overview of the various workload assessment
techniques is presented along with a set of critical criteria that need to be considered in evaluating the
feasibility of these measures for in-flight environments. It was concluded that no one single technique
can be recommended as the definitive measure of operator workload, but the resulting classification scheme
and applicability matrix can aid the investigator in choosing among presently available techniques.

INTRODUCTION 

One  need  only  compare  the  cockpit  of  a  modern  jet  fighter  to  its  World  War  II  predecessor  to  appreciate 
the  dramatic  increase  in  cockpit  complexity.  Technological  advances  during  the  past  30  years  have  resulted 
in sophisticated avionics and weapons delivery subsystems which are available to aid the aircrew in
completing a specified mission. The ultimate mission success of today's modern fighter, however, still rests
on a common factor present in its World War II counterpart. This factor is the human operator. To be an
effective  weapon,  the  modern  fighter  with  all  its  advanced  sensors  and  avionics  must  be  compatible  with 
the  capabilities  and  limitations  of  the  aircrew  operator. 

During the design, development, and test and evaluation of any new aircraft, care must be taken that
the  new  system  does  not  place  unreasonable  demands  on  the  aircrew  by  overwhelming  them  with  too  much 
information  and  too  little  time  to  process  that  information.  Such  considerations  are  often  characterized 
as  assessing  the  mental  workload  of  the  system  operator. 

When  one  reviews  the  research  literature  pertaining  to  mental  workload,  two  conclusions  are  readily 
apparent. Namely, there is no single, agreed upon definition of mental workload, and there is no single,
universal  metric  of  it.  Mental  workload  is  a  theoretical  construct,  and  as  such,  might  best  be  defined 
operationally.  Clearly,  it  is  related  to  factors  such  as  operator  stress  and  effort,  but  these  concepts 
also require operational definitions. Reising (1972) provides an excellent overview of the difficulties
and  complexities  involved  in  defining  and  measuring  workload. 

Rather than provide a single definition, one must consider the various operational definitions used
in measuring operator mental workload. The systems engineer, for example, may emphasize operational
definitions based on time available to perform a task. Psychologists tend to emphasize the information
processing aspects of mental workload and operationally define it in terms of measures related to channel
capacity and residual attention. Physiologists, on the other hand, emphasize considerations of operator
stress and arousal.

Purpose 

The impetus for this report stemmed from a selective annotated bibliography of 83 references which
represent potential measurement techniques for assessment of operator workload in operational environments
(Schiflett, 1976). This annotated bibliography categorized the various methods in terms of general
references, system analysis, subjective techniques, psychomotor performance, information processing,
physiological measures, and combined methodologies. Schiflett concluded that the majority of the methods were
developed for use in the design stage of aircrew systems, thereby making them difficult and/or impractical
to use in the later stages of the operational test and evaluation environment.

This  project  was  undertaken  to  provide  a  more  comprehensive  survey  and  analysis  of  the  presently 
available  workload  assessment  methodologies  and  was  specifically  directed  toward  the  flight  test  and 
evaluation  environment. 

Approach

To accomplish the goals of this project a comprehensive search of the scientific literature was
conducted including books, scientific journals, technical reports, and proceedings of technical meetings.
Computerized information retrieval, library searches and direct contacts with the scientific community
were used to locate relevant documents. Given the large pool of potential documents obtained by these
combined search procedures, it was necessary to adopt a set of general and specific criteria for inclusion
of a reference in the final bibliography of over 400 references appended to this paper. Details on the
search procedures as well as selection criteria are provided in Wierwille and Williges (1978).

Following the selection of the appropriate workload literature, a user-oriented classification scheme
which combined workload methodology with universal aircrew behaviors was used to generate a catalog of
presently available workload assessment techniques. Specifically, this paper provides a description of
this classification scheme and details the use of this scheme for the selection of potential measures of
workload. In addition, an overview of the resulting catalog of methodologies is presented.

CLASSIFICATION SCHEME

An important prerequisite to developing a catalog of methodologies pertinent to operator workload in
a flight environment is a comprehensive classification scheme. This scheme is necessary to form a basis
for selecting documents, to classify the citations listed in the bibliography, and to make possible the
construction of a usable analysis catalog. Consequently, the resulting classification scheme is central
to a meaningful analysis of the workload literature applicable to aircrew considerations.

One dilemma that must be resolved in developing a classification scheme is that of providing a scheme
with a meaningful organisation of existing workload assessment methodologies. A second dilemma centers
around providing a classification of human operator behaviors which are related to aircrew performance so
that accurate implications can be drawn from the vast amount of workload research that was not conducted
in a specific aviation-related context. To solve these dilemmas, the selected scientific literature was
classified according to both the universal operator behaviors present in aircrew missions as well as the
specific workload methodologies.

Universal  Operator  Behaviors 

The range of operator behaviors and their taxonomy have been investigated for several years. These
behaviors have been used to obtain an understanding of what functions an operator performs in a system
and as a basis for task analysis. One widely used listing of operator behaviors was developed by Berliner,
Angell, and Shearer (1964). This approach breaks operator behavior into four major processes (perceptual,
mediational, communication, and motor) as shown in Table 1. These four major processes are further
subdivided into seven activities and then into 47 mutually exclusive operator behaviors. Because the terms
used in this scheme are orthogonal, this classification can be expected to yield good agreement among
investigators in determining specific behaviors for a specific aircrew problem. Consequently, the Berliner,
et al. (1964) approach was used to classify operator behaviors in this report. To facilitate referencing
to this classification, a graduated numbering scheme as listed in Table 1 is used throughout.

Workload  Methodologies 

The second dimension of classification is the specific list of available methodologies that are
potentially applicable to aircrew workload assessment. The literature on workload is so diverse that
categorisation on the part of the reader of this literature is almost intuitive. It is, however, important
to select a categorisation which groups the various workload techniques in a logical way, so that conflicts
and discrepancies on workload concepts are minimised.

The  taxonomy  of  workload  methods  that  evolved  from  the  documents  reviewed  was  found  to  be  particularly 
useful and logical. This listing of methodologies is presented in Table 2 along with a graduated numbering
designation.  Basically,  the  various  methods  are  grouped  into  four  major  categories  (subjective  opinion, 
spare  mental  capacity,  primary  task  assessment,  physiological  measures)  which  are  further  subdivided  into 
28  individual  techniques. 

Literature  Classification 


The resulting two-dimensional classification scheme used the numerical designations of workload
methodologies given in Table 2 with a subset of the universal operator behaviors given in Table 1. Early
in the classification of documents according to this two-dimensional analysis it became evident that the
scientific workload literature was addressed primarily to overall human performance as compared to specific,
detailed aspects of performance. Consequently, the literature reviewed could be classified only according
to the four major processes and seven activities shown in Table 1 instead of the 47 mutually exclusive
behaviors. Even at this less-refined level of analysis, classification of the literature according to the
operator behaviors dimension appeared to be more subjective and unreliable than classification on the
second dimension of various workload methodologies.

Applicability  Matrix 

Following the abstracting and classification of the selected documents, all the references were
summarized into a two-dimensional applicability matrix which indicated the potential use of each of the
28 workload assessment techniques across the seven universal operator behaviors. A four-point rating scale
was used to represent the amount of positive research evidence supporting the potential use of each
workload technique for each operator behavior. These ratings included:

0:  Workload method is unsuitable for assessing workload of the operator behavior cited.
    No research or only negative research support.

1:  Workload method is potentially suitable for assessing workload of operator behavior cited.
    Some contradictory evidence exists; further research is needed.

2:  Workload method is suitable for assessing workload of the operator behavior cited.
    No contradictory evidence exists; further research is needed.

3:  Workload method is suitable for assessing workload of the operator behavior cited.
    No contradictory evidence exists. Application is proven.

The complete applicability matrix resulting from this analysis is shown in Table 3.

It should be noted that the ratings in Table 3 are based on all the research reviewed and, as such,
represent data collected in laboratory simulator, field, flight simulator, and flight test environments.
This was done to provide an overview of all the available data so as to suggest potentially applicable
techniques for the aircrew test and evaluation environment. Conceivably, none of the data used for a
particular rating was from the flight test environment. Table 3, therefore, is not totally suggestive
of overall ratings of research supporting the use of a technique in the flight test environment (research
of this type is, in fact, quite limited); rather it merely suggests a potentially applicable approach. To
complete the evaluations for possible selection of a workload assessment technique in the flight test area,
one must carefully consider the critical criteria for selection as well as the detailed evaluation of each
technique. Nevertheless, considerable judgment on the part of the authors was necessary in several cases
in arriving at a rating.

SELECTING  A  WORKLOAD  ASSESSMENT  METHODOLOGY 


The literature summarized in Table 3 could be used for a variety of purposes. For example, cells
resulting in 0 or 1 ratings could suggest areas for additional methodological research. Of primary
importance, however, is the use of the classification scheme and resulting applicability matrix as an aid
in the selection of a workload assessment methodology for aircrew flight test and evaluations.


Steps in Selecting a Method

The information summarized in the applicability matrix presented in Table 3 as well as the complete
catalog description of workload estimation techniques presented by Wierwille and Williges (1978) can be
used as a guide in selecting a workload assessment methodology in the following six-step procedure:

Step 1:  Specify the aircrew problem for which mental workload is to be evaluated.

Step 2:  Perform a general task analysis using specific operator behaviors.

Step 3:  Using the workload method applicability matrix (Table 3), calculate workload methods
weighting. Rank order the methods.

Step 4:  Select the first N methods in ranking. Study each of the N methods in the workload
methodology literature review.

Step  5:  Select  the  method  to  be  used. 

Step  6:  Read  referenced  documents  and  plan  the  workload  measurement  experiment. 


The first step is to define the particular aircrew problem for which mental workload is to be evaluated
[...] of the mission, and particular aircrew task. The second step is to relate the aircrew problem to the
universal operator behavior dimension
shown in Table 1. This may be done by examining a task analysis which uses these terms or by having the
investigator directly assess which behaviors are required of the aircrew member during the task. With the
completion of Step 2, the aircrew problem dimension and the operator behavior dimension have been compressed
into a single dimension of specific operator behaviors which can be related to the seven universal operator
behaviors of the applicability matrix (Table 3).


To aid in the completion of Steps 2 and 3, a worksheet as presented in Table 4 is useful. The investigator
checks the top of the appropriate columns on the worksheet of the universal operator behaviors which
are germane to the particular aircrew mission as determined by Steps 1 and 2. This essentially applies
equal weightings to the various operator behaviors chosen. Alternatively, each dimension can be weighted
according to the importance attributed to each operator behavior present in a particular mission. For
example, searching for and receiving information (1.1), information processing (2.1), and communication
processes (3.) may be the central operator behaviors in a particular mission. Rather than "checking"
these three dimensions on the worksheet, the investigator determines that communication processes are
perhaps twice as important to the mission as the other two. Consequently, communication processes are
weighted as 2 on the worksheet, and the other two operator behaviors are weighted as 1.
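
The two ways of filling in the top row of the worksheet can be sketched in a few lines of Python. This is only an illustration: the function names and the use of the Table 1 category numbers as dictionary keys are assumptions made for the sketch, not part of the published worksheet.

```python
# Category numbers as quoted in the text from Table 1 (seven activities).
BEHAVIORS = ["1.1", "1.2", "2.1", "2.2", "3.", "4.1", "4.2"]


def check_weights(checked):
    """'Check' approach: every checked behavior counts equally (weight 1)."""
    unknown = set(checked) - set(BEHAVIORS)
    if unknown:
        raise ValueError(f"unknown behavior codes: {sorted(unknown)}")
    return {b: (1 if b in checked else 0) for b in BEHAVIORS}


def importance_weights(importance):
    """Weighting approach: use the stated importance; unlisted behaviors get 0."""
    unknown = set(importance) - set(BEHAVIORS)
    if unknown:
        raise ValueError(f"unknown behavior codes: {sorted(unknown)}")
    return {b: importance.get(b, 0) for b in BEHAVIORS}


# The example from the text: 1.1, 2.1 and 3. are the central behaviors,
# with communication processes (3.) judged twice as important.
checks = check_weights({"1.1", "2.1", "3."})
weighted = importance_weights({"1.1": 1, "2.1": 1, "3.": 2})
```

Either dictionary can then serve as the top row of the worksheet when the ratings are combined.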

In Step 3, the matrix of Table 3 is used to determine the applicability rating of each workload
assessment technique. This is done by entering the applicable ratings from Table 3 on the worksheet and
adding the row of numbers for each workload technique. If the "check" approach is used, only the rating
values from Table 3 are added on the worksheet for each row and placed in the "SUM" column of the
worksheet. If a weighting approach is used, the weighting is multiplied by each applicability rating number
of the corresponding row; and, subsequently, the rows are added and placed in the "SUM" column of the
worksheet.
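
Stated compactly, the worksheet arithmetic is a weighted row sum. The notation here is introduced only for illustration and does not appear in the source: $w_b$ is the check or weighting entered for universal operator behavior $b$, and $r_{m,b}$ is the Table 3 rating of workload method $m$ for that behavior.

```latex
\mathrm{SUM}_m \;=\; \sum_{b=1}^{7} w_b \, r_{m,b},
\qquad r_{m,b} \in \{0, 1, 2, 3\},
\qquad w_b \in \{0, 1\} \ \text{under the ``check'' approach.}
```

Under the "check" approach the sum reduces to adding the ratings in the checked columns; under the weighting approach each rating is scaled by the importance assigned to its column before the row is added.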


Step 4 involves the rank ordering from highest to lowest score for each workload technique. The
techniques with the highest scores are then selected. These N workload techniques are the most applicable
for the particular aircrew problem under study. Also as a part of Step 4, the investigator reads the
workload catalog summary and the bibliography pertinent to each of the N particular techniques that had the
highest scores.


It is difficult to state beforehand how large N should be. Most likely, it will be between 3 and 5
for most workload problems. However, judgment on the part of the investigator must determine the value of
N.
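
The ranking and cut-off in Step 4 can be sketched as follows. The method names and sums are hypothetical placeholders, not values from Table 3, and the handling of ties (equal sums keep their input order) is an assumption, since the text does not say how ties should be broken.

```python
def top_n_methods(sums, n):
    """Rank methods from highest to lowest SUM and return the first n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # sorted() is stable, so methods with equal sums keep their input order.
    ranked = sorted(sums.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical worksheet sums, for illustration only.
example_sums = {"method A": 12, "method B": 7, "method C": 15, "method D": 12}
selected = top_n_methods(example_sums, 3)
```

The choice of `n` stays with the investigator; the function only applies whatever value is passed in.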


Once the investigator has read the summary of each technique, it should be possible to select the
technique that is to be used. This is Step 5. Obviously, judgment again plays a major role. More
specifically, practical aspects will have to be taken into consideration. Comparative difficulty of
implementation, cost of the experiment, and ability to meet space, weight, and power requirements are some
of the factors involved. A feasibility matrix for the selection of workload methodologies for in-flight
environments is found in Table 6.

Once the technique is selected, the investigator should obtain and read in detail the documents cited
in the bibliography relating to the specific technique. This will insure that available information is
used in conducting and carrying out the workload assessment experiment. Pitfalls and potential
misapplications might also be avoided.

SAMPLE  APPLICATION 

In  this  section  the  procedure  for  selection  of  one  or  more  workload  techniques  will  be  demonstrated 
by  a  sample  problem.  After  a  brief  description  of  the  environment  associated  with  the  sample  problem, 
the steps of the selection procedure will be described.

Background:  The  SS-3  Operator's  Task 

The SS-3 operator's position in the P-3 aircraft is one of control and usage of the aircraft's
non-sonar sensors. Several sensor systems are available to the operator, and the corresponding support
equipment for them is quite complex.

The SS-3 operator communicates with the remainder of the crew using an open-line intercom that is
common among the entire crew. Main communications are with the TACCO (tactical aircraft coordination
officer) and the pilot; however, substantial listening to the "problem being worked" is also performed
by the SS-3 operator.

The SS-3 operator is responsible for ESM (electronic support measures). ESM is essentially the
passive evaluation and identification of incoming radar signals. The corresponding emitters may be ships,
aircraft, or ground-based radars and may be either friendly or hostile. Ordinarily, many radar signals
are impinging on the aircraft. With the aid of the aircraft's central computer and the MPD (multipurpose
data display) at the SS-3 position, the operator must sort and evaluate them.

The SS-3 position also contains the MAD which is designed to provide precise location information on
partly  or  fully  submerged  vessels  at  short  ranges.  The  system  determines  anomalies  in  the  earth's 
magnetic  field  resulting  from  large  amounts  of  magnetic  or  paramagnetic  material. 

In some updated P-3 aircraft, the IRDS (infrared display system) has been added. The IRDS operates
at intermediate ranges between those of the radar and the MAD. It provides a television-like
raster-scanned image to the SS-3 operator. This sensor not only provides directional information on a target or
contact, but also provides an infrared (heat sensitive) picture showing details of platforms such as
superstructures, rigging, antennas, snorkels or periscopes. Positive identification of the contact or
target can often be made on the basis of these details.

It is important to recognize that the ESM system, the radar, the IRDS, and the MAD are all tied into
the aircraft's central computer and appear in one form or another on the MPD before the SS-3 operator.
A large portion of the operator's workload involves updating the information, selecting modes, and
performing "evaluation" operations. The operator has before him numerous command and data entry pushbuttons
as well as a trackball (usable with either hand). The trackball allows the positioning of cursors and
symbols on the MPD so that specific coordinates may be inputted in what appears as an analog or digital
mode.

Application of the Procedure: An Example

Step 1. Aircrew workload problem statement. The IRDS is being installed in updated P-3 aircraft.
In making this addition, the mental workload of the SS-3 operator is going to be increased. It is desired,
therefore, to determine the workload of the SS-3 operator both with and without the addition of IRDS.

There are two points in a tactical intercept mission where highest workloads may be presumed to occur:
first, when the SS-3 operator is attempting to transition from the radar to IRDS, and second, when the
operator is attempting to identify the target using the IRDS (having previously acquired it). For purposes
of explanation, discussion will be limited to the second high-workload condition.

When the IRDS is in use, the SS-3 operator controls the position of the sensor, vectors the pilot to
the target, and keeps the TACCO apprised of details becoming visible on the IRDS display. As soon as
positive identification is made by the SS-3 operator, the TACCO is informed.

Depending on the tactical situation, the SS-3 operator performs ESM duties and keeps the MPD and
aircraft computer updated on the tactical situation. Furthermore, the operator listens carefully to crew
communications over the intercom.

During the same time period, if the aircraft does not contain an IRDS, the SS-3 operator vectors the
pilot using the radar. Responsibility for positive identification is then transferred to the cockpit crew.
The ESM, MPD, computer, and communication tasks remain essentially the same for the SS-3 operator during
this time period.

Step 2. Determination of operator behaviors. Since the higher workloads are likely to occur with
the IRDS present in the system, and since a common workload methodology should be used for both
IRDS-present and IRDS-absent cases, the IRDS-present case will be used to determine operator behaviors and
weightings. In the IRDS-absent case, only slight changes would occur, having to do with target
identification.

In terms of the intercom task, the universal operator behaviors category (Table 1) is 3. Communication
Processes. This is weighted with an importance of 4 on a scale of 0 to 5, where 0 is "of no consequence"
and 5 is "absolutely critical" to mission success. Whether the SS-3 operator is verbally vectoring the
pilot or providing details on the identification to the TACCO, the intercom task is very important.

The IRDS aspect of the task involves calculations of vectoring information and visual discrimination
of details in the scene. These two aspects should receive a top rating of 5 because the mission is
dependent on the SS-3 operator's abilities at directing and rapid identification. The task consists of 1.2
Identifying Objects, Actions, and Events, and 2.2 Problem Solving and Decision Making. Continuous tracking
would also be performed. But, because slight errors in pointing the IRDS sensor would probably not harm
identification (as long as the target remained in the field of view), the behavior 4.2 Complex/Continuous
Motor Processes could be given a weighting of 3.

For most situations, the MAD would not yet have come into operation in the scenario, so it would be
assumed that it is not part of the task. Similarly, while the radar might be operating, it would probably
only be used as a back-up (when the IRDS is operating and target acquisition has already been made).

The ESM system would continue to operate and to provide information on radar emitters in the area.
Under the assumption that the P-3 is not itself under attack, the SS-3 operator would relegate ESM tasks
to a lower priority. The examination of radar contacts would primarily involve 1.1 Searching for and
Receiving Information, 2.1 Information Processing, and 4.1 Simple/Discrete Motor Processes. This would
be given a priority rating of 2. Obviously, a much higher priority would be given to ESM (probably 5)
if the aircraft were under attack.

The SS-3 operator would also be performing data input duties to the MPD and computer to the extent
possible. However, these aspects would be of a bookkeeping and update nature, since primary communication
would be via the intercom. Nevertheless, the operator would perform the task to the extent possible. It
involves 1.1 Searching for and Receiving Information, 2.1 Information Processing, 2.2 Problem Solving and
Decision Making, 4.1 Simple/Discrete Motor Processes and 4.2 Complex/Continuous Motor Processes. When
the IRDS identification task is being performed, MPD and computer updating might have an importance
weighting of 1.

If the highest priority weighting stated above is used for each operator behavior, the weighting
would appear as shown in the first horizontal line of numbers of the worksheet for this example, as shown
in Table 5.

Step 3. Workload methods weighting and rank ordering. Having obtained the necessary universal
operator behavior weighting for the specific SS-3 operator workload problem, it becomes possible to compute
the relative weightings of workload techniques and to rank order them. This is done by multiplying each
number in the "Behavior Check ( ) or Weighting" row of Table 5 by the corresponding number in each row of
Table 3. Each individual product is then entered in Table 5 in the appropriate workload methodology row
and operator behavior column.

All products in each row are then added, and the sum is placed in the right hand column. The workload
methodologies exhibiting the highest sums are the ones most applicable to the SS-3/IRDS problem.
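
The multiply-and-sum procedure of Step 3 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the behavior weightings, and the applicability ratings below are hypothetical and are not taken from Table 3 or Table 5.

```python
def rank_techniques(behavior_weights, applicability):
    """Return (technique, score) pairs sorted by descending score.

    behavior_weights: dict mapping operator behavior -> importance weighting
    applicability:    dict mapping technique -> dict of behavior -> rating
    """
    scores = {}
    for technique, ratings in applicability.items():
        # Multiply each behavior weighting by the technique's rating for
        # that behavior, then add the products across the row.
        scores[technique] = sum(
            weight * ratings.get(behavior, 0)
            for behavior, weight in behavior_weights.items()
        )
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical numbers for illustration only (not the report's tables).
behavior_weights = {"1.1": 5, "2.1": 5, "2.2": 4, "4.1": 3, "4.2": 1}
applicability = {
    "Rating scales": {"1.1": 2, "2.1": 2, "2.2": 2, "4.1": 1, "4.2": 1},
    "Tracking secondary task": {"1.1": 1, "2.1": 1, "2.2": 0, "4.1": 1, "4.2": 2},
    "Pupillary dilation": {"1.1": 1, "2.1": 2, "2.2": 2, "4.1": 0, "4.2": 0},
}

ranking = rank_techniques(behavior_weights, applicability)
for technique, score in ranking:
    print(f"{technique}: {score}")
```

With these made-up inputs the row sums are 32, 23, and 15, so the rating-scale row ranks first. The real outcome depends entirely on the values entered in the worksheet.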

Step 4. Selection of N techniques; study of the techniques. The results of the selection procedure
indicate that the following six techniques (ranked by numerical score with the highest first) are the most
appropriate for the SS-3 operator workload problem:

2.1.1  Task  Analytic;  Task  Component,  Time  Summation 

1.1 Opinion; Rating Scales

1.2  Opinion;  Interviews  and  Questionnaires 

2.2.1 Secondary Task; Arithmetic-Logic (Nonadaptive)

2.2.2  Secondary  Task;  Tracking  (Nonadaptive) 

4.1.8 Physiological; Pupillary Dilation

The initial selection of six techniques rather than some other number is arbitrary. However, techniques
having scores substantially below the highest ranking score are not likely to result in accurate, reliable
assessment of operator workload, because the corresponding techniques are not fully proven.

It is worth noting that small changes in the weightings of importance of the universal operator
behaviors would probably not have changed the outcome of the selection procedure up to this point. Most
likely the same six techniques would have been selected. Larger changes in the weightings might have
resulted in a different set of techniques being selected, particularly for the fourth, fifth, and sixth ranks.

After studying the six techniques more carefully using section 4 of Wierwille and Williges (1978), it
should become possible to select one or possibly two to be implemented. As a means of carrying the example
through, the advantages and disadvantages of the six techniques will be briefly reviewed.

The task-component, time summation technique is primarily analytical. However, it could be easily
adapted to the T&E environment by having SS-3 operators perform each small segment of each task separately.
These could be timed. Subsequently, time available could be determined from the mission scenario, and
assessment of workload determined. The apparent drawbacks to such a technique are its complexity and the
fact that SS-3 operators may be capable of performing simultaneous tasks because of their high skill level.

The two opinion techniques are clearly applicable. It is probably true that the technical training
of SS-3 operators is sufficient to make them highly reliable judges of their own mental workload. The
investigator would have to present and specify the problem carefully so that the operators would have a
clear picture of what is expected. Because of their high level of motivation, it is probable that accurate
assessment of maximum tolerable workload could be obtained.

The two secondary task techniques might also be applicable. Preference should probably be given to
the arithmetic-logic task, because the operator will have his left hand in use for the trackball and his
right hand in use for the IRDS controller. Introduction of yet another manual control for tracking would
probably cause congestion and severe intrusion. Even the arithmetic-logic task will to some degree cause
congestion because the operator is already using both hands, one foot, his voice, both ears, and his
vision (with at least two displays). If at all possible, the secondary task should in some way be
integrated into the present task through programming. Perhaps the ESM contacts, properly attended by the
SS-3 operator, could be scored as a secondary task. Since the operator would relegate this task to a low
priority anyway during the specified scenario, instructions to the operator would already be similar to
his present method of operating.

The technique of pupil dilation is perhaps the least proven of the six; yet it holds promise. The
SS-3 operator's station in the aircraft is already somewhat isolated. A curtain can be drawn around the
open side of the station, and the side window can be blocked. Consequently, ambient lighting could be
maintained constant. A small video camera could probably be installed at the upper right hand corner
of the MPD. Alternatively, a commercially available, head-mounted pupillography system could be used.

It should be noted that gathering of pupil dilation information is complicated by eyelid droop when
the observer becomes tired. Normally SS-3 operators are on duty for six to twelve hours. Care would
therefore have to be taken to fly short missions for data taking purposes.

Step 5. Workload method selection. It is believed that any of the initial six methods could be used
to assess the SS-3 operator's workload. Final selection becomes a matter of ease of implementation, costs,
and other matters of feasibility as indicated later in Table 4. On the basis of these factors it is most
likely that an opinion approach could be most rapidly and easily implemented. It would therefore be the
recommended first choice. The task component, time summation technique, using experimentally derived task
element times, would provide highly quantitative results. Therefore, it would be a good second choice.
However, a great deal of time and effort might have to go into the experiment and the data analysis.

Step 6. Study of documents; planning of experiment. Further study of documents referenced in the
appendix should make possible the construction of an opinion technique that has all the desired attributes
for the particular SS-3 problem under examination. The choice of rating scales, questionnaires, interviews,
or some combination thereof would have to be made.

Preliminary planning of the experiment should include a test of the technique on operators who would
not participate in the later data-taking session. These operators could aid in uncovering confusing terms
in questionnaires or rating scales, and in ironing out problems of terminology, instructions, and scoring.

The final experimental plan should be such that the experiment, when conducted, will yield statistically
significant differences in experimental conditions if in fact there are differences. The most
prized result is significant differences in workload levels. Under these conditions, definite conclusions
can be drawn regarding workload.

OVERVIEW OF WORKLOAD TECHNIQUES

This section provides a brief overview of the various workload estimation techniques at the second
level of classification as shown in Table 2. Each procedure is described only in terms of its theory and
background, because of the brevity requirements of this paper. However, a complete description of method/
apparatus, areas of applications and examples, limitations, and suggested RDT&E follow-up can be found
in Wierwille and Williges (1978). To provide an overall evaluation of these various techniques, certain
critical criteria must first be considered in the in-flight environment.


The  In-Flight  Aircrew  Workload  Problem 


Whenever an attempt is made to measure workload in-flight, a number of practical considerations become
important. These considerations deal primarily with the difficulties of introducing or adding anything
to the cockpit or crew station environment. These practical considerations go well beyond those involved
in ground simulation and have far-reaching ramifications on workload technique assessment.


Physical space. In most aircraft, crew positions are carefully designed to take maximum advantage
of the space available. This space is usually limited by airframe and other design considerations in very
complex trade-offs. The introduction of any device having substantial size will compromise the efficiency
of the original design. Needed controls or displays may be obscured or made inaccessible. Crew comfort
might also be sacrificed by reducing the already limited freedom of movement available. Therefore, for
most airborne situations, desirability and practicality of the workload measurement equipment increases as
the physical size of the equipment decreases. Also, there is an upper bound on tolerable size, which for
some situations might be as little as one-eighth cubic foot. However, a study of allowable size could
profitably be performed.

Obviously, the physical size consideration is not as severe in ground simulation. Usually in this
case a way can be found to "fit" another device into the simulation. Since the workload measurement
apparatus need not be self-contained, supporting parts can be "hung" outside the simulated crew station.

Portability and self-containment. In general, it would be desirable to assess workload using a
single small package that can be easily added to the crew station. Prototype and operational aircraft
usually do not include power sources and telemetry or recording equipment for optional equipment in the
crew station. Furthermore, present practice would probably not permit modifications of operational
aircraft for powering and recording of workload measurement. The assumption, therefore, must be made that a
workload assessment system must be largely self-contained if it is to be used in-flight.

Intrusion and safety. It is well known that many methods of workload measurement tend to intrude on
tasks at hand (primary tasks). An aspect of intrusion that must be considered separately because of its
importance is that of safety.

Certain types of flight operations are in themselves critical. Take-off, landing, ejection, and
any other type of system failure, are examples of critical operations. Two types of safety-related
intrusion may possibly occur through introduction of workload measurement equipment: obstruction and
distraction. Obstruction involves the problem of having an extra physical object within the space needed
to deal with a critical operation. Distraction pertains to the fact that the workload assessment may
draw the crew member's attention away from the critical situation. Unless backup crew stations are
available, it may be inadvisable to assess workload of certain critical operations in flight except by a
posteriori techniques which by their nature do not intrude.

Data transmission or recording. It is one problem to design a feasible workload task for in-flight
use; it is yet another to score the task and analyze the results. There appear to be three alternatives
in the area of data analysis:

1. Perform in-flight analysis and record the processed data output in concise form for later use;

2.  Record  or  store  the  unprocessed  data  for  later  playback  and  analysis;  and 

3.  Telemeter  or  otherwise  transmit  unprocessed  data  to  a  ground  station  for  recording  or  processing. 

Experimental controls. A problem that may arise when performing in-flight experiments is that of
obtaining adequate experimenter controls. The investigator or experimenter may not be on board when the
workload assessment procedures are conducted. Consequently, radio contact may have to suffice. In those
cases where the experimenter remains on the ground, workload assessment should be obtained by a system
that is procedurally simple to operate. Also this system should be as "fudge-proof" as possible so that
the effects of biases of the aircrew members are minimized.

Workload assessment integration. Because some modern aircraft incorporate computer graphic displays
with substantial computer capability, the possibility exists that certain workload assessment techniques
may be integrated into crew stations through software. Existing capabilities or near future capabilities
may be such as to permit special modes of operation of standard displays and controls that would permit
workload assessment. Scoring might be accomplished by the on-board computer and the results stored in
condensed form for post-flight readout. Not all methods of workload assessment may be suitable for this
kind of integration; but initially, it appears that certain ones would be applicable. A feasibility
study of the programming potential of new aircraft systems for workload measurement appears to be a fruitful
area of research.

In-flight workload assessment summary. In-flight measurement of workload represents a challenge well
beyond that of ground simulation. Factors such as physical size, weight, intrusion related to safety,
portability, and experimental control become extremely important. Techniques that work well on the ground
may therefore prove infeasible for in-flight use, particularly during critical mission phases such as
take-off, landing, or subsystem failure (degraded mode). Nevertheless, newer techniques are becoming
available that can eliminate or at least minimize the in-flight problems, principally the inclusion of
workload assessment as a software change in the aircraft's avionics system (using the existing computers
and graphics capabilities), and the use of microprocessors in self-contained miniaturized modules that
perform all functions involved in workload assessment.

Table 6 provides a summary of the seven critical criteria used to evaluate each of the various workload
measurement approaches for the in-flight environment. This matrix provides some perspective on the relative
feasibility of implementation, provided the measurement technique could otherwise be perfected. Details of
these feasibility considerations are provided in the descriptions of each method which follow.

WORKLOAD  TECHNIQUES  SUMMARY 

1.  SUBJECTIVE  OPINIONS 

Subjective opinions are a commonly used measure of workload in flight test and evaluation. Often
this measure is used in conjunction with other indices to provide a broader basis for evaluation and
comparison. A variety of techniques exist for gathering subjective opinions. These include psychometrically
defined rating scales, structured questionnaires with dichotomous or multiple choice responses,
open-end questionnaires, structured interviews, and unstructured interviews.

In workload assessment applications, primarily two general approaches have been used. The more
systematic approach deals with the use of rating scale procedures for obtaining pilot opinions, whereas
the second area deals with less structured approaches using a variety of interview and questionnaire
procedures. Often the terms rating scale and questionnaire are used somewhat interchangeably in the
scientific literature. For the purposes of this review, rating scales will be used for procedures which
represent subject opinions gathered by devices with psychometric scaling properties, and questionnaires
used in structured interviews will refer to procedures that are not based strictly on scaling
considerations. Consequently, questionnaires have been grouped with interviews for the purposes of this review.

1.1 Rating Scales. Over the last twenty years much work has been dedicated to the development of rating
scales for assessing the handling qualities of aircraft. These scales ordinarily contain about ten
categories with descriptors that are not readily subject to confusion. The most widely used of these
scales is the Cooper-Harper scale (1969). It is accepted for use in handling qualities work and is
primarily used by test pilots. The descriptors of this scale pertain to the "flyability" of an aircraft.
Even though the scale does contain some reference to workload, the descriptors would have to be modified
for use in workload applications. If this Cooper-Harper scale were used for workload assessment in its
present form, the assumption must be made that handling difficulty and workload are directly related.
Such an assumption may well be unwarranted.

Recently, some research has been directed toward the development and evaluation of workload-specific
rating scales. Comparisons have been made between the workload measurements obtained from rating scales
and those obtained from primary task performance, secondary tasks, occlusion, and physiological measures
(Hicks and Wierwille, in press). Specifically, the rating scale proved to be a sensitive measure of
workload and resulted in little intrusion on the primary task. Additional research has been directed toward
developing a research-based, conjoint rating scale of workload for the F-18 aircraft (O'Conner and Buede,
1977; and Donnell and O'Conner, 1978) which was a direct outgrowth of the work of Helm (1975, 1976a, and
1976b).

With the exception of the conjoint measurement technique, most previous approaches have failed to
follow rigorous psychometric procedures in developing workload rating scales. Examples of the use of
ratings in this regard can be given both for flight simulator studies (e.g., Johannsen, 1976; Kreifeldt,
Parkin, and Rothschild, 1976; Murphy and Gurman, 1972; and Schultz, Newell, and Whitbeck, 1970) and
flight tests (e.g., Baker and Intano, 1974; Helm, 1975, 1976a; Lebacqz and Aiken, 1975; and Stackhouse,
1973).

1.2 Interviews and Questionnaires. In contrast to the rather rigorous procedures available for the
development of rating scales, the procedures used in interviews and questionnaires are not nearly as
structured. Applications of these procedures to aircrew workload assessment range from completely
open-ended debriefing sessions after flights (Soliday, 1965), to self-reporting logs of stressful activities
(Soutendam, 1977; Cantell and Hartman, 1967), to carefully chosen questionnaire items (Steininger, 1977).
Recent work by Rohmert (1977) demonstrates procedures that can be employed in questionnaire
development. This approach, called the "Ergonomic Job Description Questionnaire," was developed specifically for
workload evaluations of air traffic control activities.

If questionnaires and interviews are used in an unstructured or open-ended way, care still needs to
be given to the appropriate topic areas and questions chosen for inclusion. If, on the other hand,
structured responses are used, the choice of response items (e.g., dichotomous or multiple choice) should be
constructed and tested much in the same manner as described for rating scales.

2. SPARE MENTAL CAPACITY

The largest body of research data dealing with the measurement of human operator workload is concerned
with the evaluation of the concept of spare (residual or reserve) mental capacity. This concept is grounded
on the fundamental assumption of a single-channel, sampling model of the human operator (Knowles, 1963; and
Rolfe, 1973b). The approach assumes that an upper bound exists on the ability of the human operator to
gather and process information. Spare mental capacity, then, is the difference between the total workload
capacity of the operator and the capacity needed to perform the task. As spare mental capacity decreases,
the operator's workload increases until a point of overload is reached. At this point, the information
processing demands of the task exceed the operator's total workload capacity.

A variety of methods and procedures have been developed to measure, both directly and indirectly,
spare mental capacity. In addition, a great deal of laboratory research data exist on empirical tests of
various ramifications of the single-channel concepts. For example, data are available on the possibility
of multi-channel processing; procedures for switching attention among channels; various points of conflict
or bottlenecks in the human information processing channel; and variations in the upper limit of an
individual's mental workload capacity due to factors of stress, emotional state, fatigue, and effort. Much
of this human performance research is summarized by Kahneman (1973) and will not be reviewed at this time.

Essentially, three general methodological approaches have been advanced for measurement of workload
using the generalized spare mental capacity paradigm. These approaches include task analytic, secondary
task, and occlusion procedures. These methods are presented with the overall caution that even though the
underlying single-channel, sampling model of the human operator is a viable concept, it is not
a totally unequivocal hypothesis in terms of supporting data.

2.1 Task Analytic. Task analytic methods assess spare mental capacity by using mathematical/theoretical
methods from systems engineering. The data base used in these techniques is most often obtained through
laboratory and simulation experiments rather than flight tests. Task analytic methods assume that all
task components, performed serially, require specified lengths of time to complete. As long as the actual
time available for overall completion exceeds the sum of theoretical time durations for performing the
task components, spare mental capacity exists. However, when the actual time available is insufficient,
stress and task overloading occur. Task analytic methods consist of either task component/time summation
computer models (Greening, 1978) or information-theoretic based procedures (Senders, 1970; and Baty, 1971).
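
The time-summation comparison described above can be illustrated with a minimal Python sketch. The function name, the expression of the result as a fraction of available time, and the component durations are assumptions made for illustration; they do not reproduce any of the cited computer models.

```python
def spare_time_fraction(component_times, time_available):
    """Fraction of the available time left after serial task components.

    A positive value suggests spare capacity exists; a negative value
    suggests the time available is insufficient (overload).
    """
    if time_available <= 0:
        raise ValueError("time_available must be positive")
    # Serial performance is assumed, so required time is a plain sum.
    time_required = sum(component_times)
    return (time_available - time_required) / time_available


# Hypothetical component durations (seconds) for one mission segment.
component_times = [4.0, 2.5, 6.0, 1.5]
fraction = spare_time_fraction(component_times, time_available=20.0)
overloaded = fraction < 0
print(f"spare time fraction: {fraction:.2f}, overloaded: {overloaded}")
```

Because the sketch assumes strictly serial performance, it would overstate the load on skilled operators who can perform some components simultaneously.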

2.2 Secondary Task. Most behavioral research approaches to estimating spare mental capacity have used
secondary task procedures. This approach provides the human operator with an additional (secondary) task
to be performed only when the main (primary) task has been fully attended. Performance on the secondary
task theoretically decreases as the attentional demand of the primary task increases. Secondary task
performance, then, becomes an indirect measure of operator workload.
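
One simple way to turn secondary task performance into an index is to express it as a relative drop from a baseline measured with the secondary task performed alone. The sketch below assumes that particular index; the function name and all scores are hypothetical, and other scoring schemes are equally possible.

```python
def secondary_task_decrement(baseline_score, dual_task_score):
    """Relative drop in secondary-task score under primary-task loading.

    0.0 means no drop from baseline; 1.0 means the secondary task
    was not performed at all.
    """
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return (baseline_score - dual_task_score) / baseline_score


# Hypothetical scores: correct arithmetic answers per minute.
baseline = 20.0  # secondary task performed alone
dual_task_scores = {"low primary demand": 18.0, "high primary demand": 9.0}

decrements = {
    condition: secondary_task_decrement(baseline, score)
    for condition, score in dual_task_scores.items()
}
for condition, value in decrements.items():
    print(f"{condition}: decrement = {value:.2f}")
```

A larger decrement is read as less spare capacity, on the assumption that the operator really did give priority to the primary task.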

Choice of the secondary task and procedures used to administer it become central issues in considering
this method of workload assessment. Knowles (1963), for example, states that a viable secondary task for
workload assessment should not physically interfere with the primary task, require little of scoring.
Detailed reviews of the extensive literature on secondary tasks are provided by Rolfe (1973b) and Levine,
Ogden and Eisner (1978).

2.3 Occlusion. In many cases where workload is to be estimated, the primary information input to the
operator is visual. The occlusion method of workload estimation can be used in such cases (Senders, et
al., 1967).

Occlusion is a time-sharing technique and as such is similar to the secondary task method. However,
in occlusion the time-sharing is accomplished by suppressing information inputs; that is, by giving the
operator time samples of visual information. Examples of automobile driver research using this technique
are found in Farber and Gallagher (1972) and Hicks and Wierwille (in press).

3.  PRIMARY  TASK  MEASURES 

It can be hypothesized that as the mental workload of a human operator increases, the performance
of that operator may change, ordinarily in the direction of degradation. If such a change does in fact
occur, its measurement would be an indication of increased workload. This hypothesis underlies the
primary task performance method of assessing workload.

The use of primary task measures as a means of assessing workload was not particularly popular
during the 1960's and early 1970's, because initial indications were that operators adapt to changing
conditions, thereby holding performance constant. As Cooper and Harper (1969) put it, "In a specific
task, he (the pilot) is capable of attaining essentially the same performance for a wide range of vehicle
characteristics, at the expense of significant reductions in his capacity to assume other duties. . ."
In this case they were referring to measures such as glide-slope error or flight path error in turbulence.

A somewhat more detailed examination of performance, however, might provide an indication of changes.
As a task becomes more difficult, an operator may summon more effort, thereby holding performance in a
specific variable or set of variables constant. However, to maintain this performance, the operator may
have to modify his strategy. By examining measures other than those involving system output, it may be
possible to detect this shift in strategy and thereby obtain a measure of workload.

Another concept in primary task measures was recently put forth by Albanese (1977). He suggests
that "successful mission completion" is a measure of workload. In this case, if an operator is able to
complete a mission successfully, there is no overload. On the other hand, if the operator cannot
successfully complete the mission, an overload is presumed to have occurred. This rather broad concept has
distinct merit if an investigator is most concerned about the overload/nonoverload dichotomy. Primary
task measures, properly chosen, will indeed make assessment of mission success possible. Measures such
as landing touch-down performance, aiming performance, seeker lock-on, and number of procedural blunders
can be used. Successful mission completion must be defined in terms of the measures.

3.1 Single Measures (Primary Task). A very large number of workload studies (Murphy, et al., 1974; Price,
1975; and Wickens and Kassel, 1977) have involved the use of one or more primary task measures,
individually on performance or as a precaution, while main interest was on some other method of assessing
workload (Kalsbeek and Sykes, 1967; and Trumbo, et al., 1967). In a few cases, the primary measures have
been taken specifically as a means of investigating level of workload (Brictson, 1974a, b).

3.2 Multiple Measures (Primary Task). When a human operator performs a task in an actual system, several
subtasks are ordinarily involved. In such cases, a single measure of system performance, such as error,
may be inadequate. Considerations such as stores usage, accelerations experienced, and operator
perceptual style and strategy may become important. In other situations, it may be found that single measures
of the primary task do not exhibit adequate sensitivity to operator workload, because of operator
adaptivity. In cases such as these, multiple measures of primary task variables might be considered for
workload assessment. Essentially, the use of multiple measures provides a more complete picture of operator
behavior and operator/system performance.

To obtain the maximum information, the multiple measures should first be subjected to a combined
analysis and then subsequently to individual analysis where appropriate. Techniques that can be used for
the combined analysis include multiple-regression analysis, correlation analysis, and various multivariate
analyses. These techniques provide a sound methodological approach for drawing valid conclusions
regarding system performance and workload.
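
As a small illustration of the correlation part of such a combined analysis, the sketch below standardizes two measures recorded in different units and computes their Pearson correlation. It covers only this one step, not multiple regression or multivariate analysis; the function names and data are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw scores to z-scores (zero mean, unit population s.d.)."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - m) / s for v in values]


def pearson(x, y):
    """Pearson correlation as the mean product of paired z-scores."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)


# Hypothetical per-trial measures in different units.
rms_error_deg = [1.0, 2.0, 3.0, 4.0]
stick_reversals = [10, 20, 30, 40]
workload_rating = [4, 3, 2, 1]

r_error_reversals = pearson(rms_error_deg, stick_reversals)
r_error_rating = pearson(rms_error_deg, workload_rating)
print(f"{r_error_reversals:.2f} {r_error_rating:.2f}")
```

The made-up series are exactly linear, so the two coefficients come out at the extremes of +1 and -1; real flight data would of course fall in between.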

Ordinarily, when using multiple measures, the additional measures used are not simply a greater
number of those used in single measure analysis. Measures such as RMS accelerations, number of control
(stick) reversals, dominant spectral frequencies, and control surface zero crossings are typical of the
added measures (Kreifeldt, et al., 1976). Usually, measures such as these are intended to reflect strategy
changes instead of performance scores, because performance scores may not change at lower operator workloads.
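
Three of the added measures named above are straightforward to compute from a sampled signal. The sketch below shows one plausible set of definitions; the exact definitions used in the cited studies are not given here, so the function names, the treatment of zero samples, and the stick trace are assumptions. Dominant spectral frequency is omitted.

```python
import math


def rms(samples):
    """Root-mean-square value of a sampled signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossings(samples):
    """Count sign changes between successive nonzero samples."""
    count = 0
    previous = None
    for s in samples:
        if s == 0:
            continue  # assumption: exact zeros are skipped, not counted
        if previous is not None and (s > 0) != (previous > 0):
            count += 1
        previous = s
    return count


def control_reversals(samples):
    """Count changes in direction of control movement.

    Implemented as sign changes of the first difference of the signal.
    """
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    return zero_crossings(deltas)


# Hypothetical stick-deflection trace (arbitrary units).
stick = [0.0, 1.0, 2.0, 1.0, -1.0, -2.0, -1.0, 1.0]
print(rms(stick), zero_crossings(stick), control_reversals(stick))
```

In practice a dead band would usually be applied before counting reversals so that sensor noise is not scored as operator activity.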

In  several  cases  multiple  measures  have  been  taken  which  combine  several  totally  different  workload 
assessment  techniques.  Primary  task  measures  may  be  combined  with  any  of  the  other  methods:  opinion, 
spare mental capacity, and physiological measures (Clement, 1976; O'Donnell and Spicuzza, 1975; and
Simmons, et al. 1976). The fact that the units of these measures differ does not present a problem in
the analysis. The scores can be normalized or similarly treated in the analysis. In fact, a few studies
have been performed with the purpose of determining which of several different workload techniques is
most sensitive. (See Wierwille and Hicks, in press.)
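As a rough illustration of the normalization mentioned above, the following Python sketch converts measures recorded in different units to z-scores before combining them. The measure names, the data values, and the equal-weight composite are hypothetical assumptions made for illustration only; they are not taken from the cited studies.

```python
from statistics import mean, stdev

def z_scores(values):
    """Convert raw scores to z-scores (zero mean, unit sample standard deviation)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical scores for five trials; each measure is in its own units.
measures = {
    "rms_tracking_error_deg": [1.2, 1.5, 1.1, 2.0, 2.4],
    "opinion_rating_1_to_10": [3.0, 4.0, 3.0, 6.0, 7.0],
    "heart_rate_bpm": [72.0, 75.0, 71.0, 83.0, 88.0],
}

# After normalization every measure is unitless and directly comparable.
normalized = {name: z_scores(vals) for name, vals in measures.items()}

# Equal-weight composite per trial (an assumption; a regression could supply weights).
n_trials = len(next(iter(normalized.values())))
composite = [
    mean(normalized[name][i] for name in normalized) for i in range(n_trials)
]
```

Once the scores are unitless, they can be entered together into a regression or multivariate analysis without one measure dominating merely because of its scale.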

3.3 Mathematical Modeling. Mathematical modeling of human operators in systems has long been an area of
substantial interest to researchers. Interest began in the area of tracking and manual control systems.
Subsequently, it has branched into areas of human operator decision processes, supervisory processes,
and team interactions.

Recently, a few of the researchers (Jex and Allen, 1970a; Baron and Levison, 1975; Wewerinke, 1976
and 1977; and Wickens and Gopher, 1977) involved in modeling have begun to examine the problem of operator
workload. This has usually been done as an attendant examination, with prime interest being in model
stimulus-response accuracy (Phatak, 1973; and Watson, 1972).

Other recent studies have departed from the describing function and optimal control models. Onstott
and Faulkner (1977) (also Faulkner and Onstott, 1977) worked with an urgency model of attention allocation.
Rouse (1977b) employed queuing theory to study human interaction with computers. Also, Navon and Gopher
(1977) postulate a model based on resource allocation. These models all have some bearing on workload;
however,  results  are  preliminary. 

4.  PHYSIOLOGICAL  MEASURES 

One of the most widely researched methods of assessing operator workload is the use of physiological
measures. The physiological method generally involves the measurement and data processing of one or more
variables related to human physiological processes. The underlying concept in physiological monitoring
is as follows:

As operator workload changes, involuntary changes take place in the physiological
processes of the human body (body chemistry, nervous system activity,
circulatory or respiratory activity, etc.). Consequently, workload may be
assessed by the measurement and processing of the appropriate physiological
variables.

In many cases, there is an underlying assumption that high workload levels are accompanied by
increased emotional stress. This stress is then measured by physiological recording and is related back
to workload. Stress in this case is assumed to act as an intermediate variable, causing physiological
changes.

In other cases, the underlying assumption involves changes in the state of "arousal." Arousal may
be considered as a state of preparedness of the body or level of activation of the human organism.
Roughly, one may think of arousal as the state of excitedness. Here again, the assumption is that mental
workload changes are accompanied by changes in arousal level that can be measured by appropriate physiological
monitoring equipment.

It is worth mentioning that physiological measures of workload do not require the underlying assumption
that the human operator is a single-channel sampling device. Instead, a rather global definition of
workload may be assumed, in which mental workload is considered a conglomerate of behaviors, similar to
those enumerated by Berliner, et al. (1964).

4.1 Single Physiological Measures. The majority of work on physiological monitoring for the sake of
assessing workload has been performed using single measures. In several cases data on more than one
measure have been taken in a given experiment, but each measure has then been analyzed individually.
Such measures are considered here as single measures. Although perhaps not stated explicitly by the
investigators, the objective of these studies has been to find a single physiological measure that
accurately and reliably reflects changes in operator mental workload.

In dealing with single measures (or any physiological measures for that matter), it must be recognized
that operator behavior other than mental workload may have an effect on the physiological measures.
Physical exertion, for example, may affect the measures being taken. Consequently, the range of potential
applications of a measure may be severely limited by the confounding effect of operator behavior in areas
other than mental workload. In specific terms, a measure that varies with physical work as well as
mental work, for example, can only be used if physical work is held constant or its manifestations on the
measure are known and taken into account.

A review of each of the physiological measures as shown in Table 2 is beyond the scope of this paper.
However, the reader is referred to a discussion of combined physiological measures in the following
section 4.2 and Wierwille and Williges (1978).

4.2 Combined Physiological Measures. Certain investigators have taken the point of view that single
physiological measures may not provide adequate predictive information to allow assessment of workload.
They then proceed to analyze multiple physiological measures in a combined analysis in an effort to
better assess and predict workload. The multiple physiological measurement philosophy is the same
approach taken by researchers as was discussed in Section 3.2 for multiple primary task measures.

As with primary task measures, a common class of techniques can be applied. These include multiple-regression
analysis, correlation analysis, and multivariate analysis. The purpose in using these
statistical techniques is to provide the best prediction and discrimination of workload levels, based on
the physiological measures at hand.

Several reports and papers have been published describing multiple feature extraction techniques
applied to multiple physiological measures (Spyker, et al. 1971; Stackhouse, 1973; and Stackhouse, 1978).
The technique used is one of selecting a number of features for each physiological measure and then
performing a multiple regression. The best weighting of the most highly correlated features is then
used in the prediction equation.
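The select-then-regress procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method of the cited reports: the feature names and data are hypothetical, features are ranked by absolute Pearson correlation with the workload score, and the weights come from ordinary least squares solved through the normal equations.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(columns, y):
    """Least-squares weights (intercept first) from the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical features extracted from physiological recordings (six trials).
features = {
    "hr_variability": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "respiration_rate": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "skin_resistance": [5.0, 3.0, 4.0, 4.0, 3.0, 5.0],
}
# Hypothetical workload scores for the same six trials.
workload = [4.0, 5.5, 9.0, 10.5, 14.0, 15.5]

# Keep the two features most highly correlated with workload, then regress.
ranked = sorted(features, key=lambda n: abs(pearson(features[n], workload)),
                reverse=True)
selected = ranked[:2]
weights = fit([features[n] for n in selected], workload)
```

The resulting `weights` list holds the intercept followed by one coefficient per selected feature, which together form the prediction equation for new trials.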

Storm, et al. (1976) have performed analysis of multiple compounds in the urine believed correlated
with various aspects of aircrew-member stress. In general, while statistical analyses were not performed,
great care was taken in analyzing from a diagnostic point of view the directions and magnitudes of changes
in the levels of the compounds. Moreover, interactions were studied. A study of this type as well as
Storm and Hapenney (1976) gives a general impression of the physiological changes that occur when Air
Force aviators undergo high workload/stress conditions for extended periods of time. Brictson, et al.
(1974) and McHugh, et al. (1974) studied the effects of high workload conditions on the performance of
naval aviators in high-performance aircraft. The approach taken was one which combined stepwise multiple
regression of physiological, psychiatric, and performance measures in carrier landings.

The physiological measures in these studies on naval aviators were primarily those taken from blood
samples and included serum cholesterol, serum uric acid, blood lactate, and pyruvate. Changes in the
levels of biochemical measures were analyzed as a function of alterations in levels of workload, sleep,
performance, and mood.

4.3 Speech Pattern Analysis. Recently, there have been indications that inaudible changes take place in
speech when an individual is under stress. These changes generally are not detectable by an unaided
listener but can be elicited with the proper equipment, e.g., the Psychological Stress Evaluator (PSE). The
underlying theory of the PSE has to do with presence or absence of physiological tremor or micro-tremor
in the human voice. In general this micro-tremor is present in an individual who is not under stress.
The tremor results in a frequency modulation effect of certain voice sounds that is only detectable with
equipment. The tremor and frequency modulation of the voice become suppressed when an individual is
under stress, such as when attempting to deceive law enforcement personnel (Krads, 1974, and Dahm, 1974).

Older  and  Jenney  (1975)  analyzed  voice  communications  of  Skylab  astronauts  as  a  means  of  determining 
situational  stress.  The  scores  obtained  using  a  PSE  were  correlated  with  operational  variables  known  to 
represent  varying  degrees  of  stress.  They  found  some  statistically  significant  relationships,  but 
concluded that PSE usage was not sufficiently predictive of mild stress as to warrant use in future
missions. 

Simonov and Frolov (1977), following the work of Older and Jenney, undertook to determine the emotional
state of cosmonauts and others via voice analysis. They indicated that the problem appears very
complex  and  that  substantial  further  work  is  required. 

Harris, et al. (1977), taking a somewhat different approach, using automatic voice recognition and
synthesis equipment, showed that a verbal arithmetic task produced less decrement in concurrent manual
tracking than did a keyboard arithmetic task. They point out that automatic voice recognition equipment
introduces an additional source of error that may be dependent on task difficulty.

It  seems  clear  that  extreme  stress  can  be  measured  by  voice  analysis.  At  this  time,  however,  the 
usefulness  of  voice  analysis  for  either  mild  stress  or  mental  workload  is  unclear.  Several  investigators 
appear  on  the  verge  of  analysis  of  voice  in  regard  to  workload,  but  results  are  not  presently  available. 

CONCLUSIONS 

This survey of the workload literature has shown that several approaches are potentially useful for
the aircrew workload problem, but no one single technique can be recommended as the definitive measure
of operator workload. Because of the multidimensionality of workload, it also appears unlikely that any
one single measure will ever suffice completely. Consequently, multiple measures including the dimensions
of subjective opinions, spare mental capacity, primary tasks, and physiological correlates should be
considered. The classification scheme and applicability matrix developed in this paper should provide
the investigator with an aid for choosing among the presently available techniques.

RECOMMENDATIONS 

This study of the workload literature has provided support for several recommendations, including
implications for future work. Four of the most prominent research recommendations are presented in
brief.


This study of the workload literature has been performed in a way that will allow computerizing of
the information. The advantages of computerizing would be numerous. A user would be guided through
important citations based on the needs associated with a given aircrew workload estimation problem.

More  specifically,  relevant  references  could  be  cross-filed  according  to: 

1.  The  workload  classification  scheme. 

2.  Keyword  or  combinations  of  keywords  (in  title  or  in  abstract). 

3.  Author  or  authors. 

4.  Workload  category  or  subcategory. 

If  requested  by  the  user,  the  system  would  also  provide  a  narrative  summary  on  the  N  workload 
techniques,  to  provide  broad  necessary  background  should  the  user  not  already  have  it. 
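A minimal Python sketch of such a cross-filed reference system is given below. It is an illustration under stated assumptions rather than a description of any system actually built: the record layout, the function names `build_index` and `lookup`, and the keywords attached to each record are all hypothetical, although the citations and their categories follow the text of this survey.

```python
from collections import defaultdict

# Illustrative records; the keyword assignments are hypothetical.
references = [
    {"id": 1, "authors": ["Jex", "Allen"], "keywords": ["tracking", "modeling"],
     "classification": "primary task", "category": "mathematical modeling"},
    {"id": 2, "authors": ["Older", "Jenney"], "keywords": ["voice", "stress"],
     "classification": "physiological", "category": "speech pattern analysis"},
    {"id": 3, "authors": ["Storm"], "keywords": ["urine", "stress"],
     "classification": "physiological",
     "category": "combined physiological measures"},
]

def build_index(records):
    """Cross-file each record under all four kinds of key listed above."""
    index = defaultdict(set)
    for rec in records:
        index[("classification", rec["classification"].lower())].add(rec["id"])
        index[("category", rec["category"].lower())].add(rec["id"])
        for author in rec["authors"]:
            index[("author", author.lower())].add(rec["id"])
        for keyword in rec["keywords"]:
            index[("keyword", keyword.lower())].add(rec["id"])
    return index

def lookup(index, **criteria):
    """Return the ids of records that satisfy every criterion given."""
    sets = [index.get((field, value.lower()), set())
            for field, value in criteria.items()]
    return set.intersection(*sets) if sets else set()

index = build_index(references)
stress_refs = lookup(index, keyword="stress")
older_stress = lookup(index, keyword="stress", author="Older")
```

Combining criteria narrows the result by set intersection, which corresponds to the "combinations of keywords" search described in item 2.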


The two major advantages of subjective opinion ratings are acceptance and lack of intrusion. Pilot
acceptance of opinion ratings has been good and is well documented in the handling quality domain.
Opinion ratings are generally not intrusive. However, with the exception of the conjoint measurement
technique, most previous approaches have failed to follow rigorous psychometric procedures in developing
workload rating scales. Additionally, several other limitations of ratings also need to be considered.
Adaptivity of the pilot, for example, represents a serious problem. Due to adaptivity, ratings may be
either too high or too low. A system that initially provides the impression of being awkward to use may
obtain higher ratings than it should, because the crew member adapts. Other problems include possible
emotional state, experience, and learning.

Given the widespread use and general applicability of rating scales as a technique of workload
assessment, it is surprising that a rigorous workload rating scale has not been developed. Research is
needed to determine the underlying scaling dimensions of mental workload and to develop an interval type
metric characteristic of the conjoint measurement procedure. Recent approaches such as behaviorally
anchored response scales (BARS) may be useful in this regard. Objective anchor points such as semantic
differential such as policy capturing might be applicable in determining the relative importance of
various dimensions used in subjective estimates of workload. Research is also needed to compare the
utility of these various rating procedures and to specify the reliability and validity of the resulting
scales.


Comparison of Methods of Workload Estimation


This literature review has shown that little work has been done on experimental comparison of workload
estimation methods. To a great extent each research group in the workload estimation area tends to
advocate usually one or possibly two workload estimation techniques. One group advocates time-estimation,
another critical tracking tasks, and still others specific kinds of physiological measures. While all
this work is clearly important, particularly in regard to development, evaluation, and optimisation of
various techniques unanswered.


Hicks and Wierwille (in press) have recently addressed this problem on an initial basis. They
compared five different (specific) workload techniques in a moving-base driving simulator. Included in
their comparison were rating scales, primary task measures, secondary task measures, occlusion, and heart
rate variability. It was found that large differences in technique sensitivity existed when operator
loading was adjusted under controlled conditions. Sensitivity in this context is defined as the
statistically significant differences in operator loading. High sensitivity requires low variance of the scores
about the means. In addition, it was determined that the degree of intrusion varied with the technique,
with some being nonintrusive while others were highly intrusive.
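The link between sensitivity and the variance of scores about the means can be made concrete with a one-way analysis-of-variance F ratio, sketched below in Python. This is an illustration only and is not claimed to be the analysis used in the study cited above; the scores for the two techniques are hypothetical and were chosen so that both have the same level means but different scatter.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F statistic: between-level over within-level mean square."""
    all_scores = [s for g in groups for s in g]
    grand = mean(all_scores)
    k, n = len(groups), len(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((s - mean(g)) ** 2 for g in groups for s in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical scores at low, medium, and high operator loading.
# Both techniques have level means of 2, 3, and 4.
technique_a = [[2.0, 2.1, 1.9], [3.0, 3.1, 2.9], [4.0, 4.1, 3.9]]  # low scatter
technique_b = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]  # high scatter

f_a = f_ratio(technique_a)
f_b = f_ratio(technique_b)
```

With identical differences between the level means, the technique with less scatter about those means produces the much larger F ratio and is therefore more likely to reveal a statistically significant difference in loading.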


A similar, more complete study needs to be performed for the aircrew workload estimation problem.
At present, the comparative sensitivity of aircrew workload estimation techniques is unknown. Because
sensitivity has generally not been high, such a study is vital. Selection of a technique without
comparative information may yield results indicating that there is no change in aircrew workload for two
or more different configurations when in fact there is a change. And, since an aircrew member's workload
may already be high, failure to discriminate workload differences in a T&E situation may later
jeopardise mission success.


Workload  evaluation  is  at  present  a  highly  active  research  area.  It  is  estimated  that  more  than 
one  hundred  researchers  in  the  United  States,  Europe,  and  elsewhere  are  immersed  in  workload  research 
at  this  time.  Because  of  the  forthcoming  results  and  the  extreme  diversity  of  this  work,  the  workload 
search described here will need to be updated periodically if it is to remain current.


The updating of the search is very important since much of the work presently in progress has direct
bearing on the aircrew workload problem. More specifically, while much of the earlier research on workload
was of an exploratory nature or involved development of concepts and constructs, more recent work
has tended toward the practical with applications to aircraft and other human-operator systems problems.

Classification of Universal Operator Behavior Dimensions
(After Berliner, Angell, and Shearer, 1964)

WORKLOAD ASSESSMENT METHODOLOGY DEVELOPMENT


by 

Billy  M.  Crawford 
Systems Research Branch
Human Engineering Division
Wright-Patterson AFB OH 45433


The  Workload  Problem 

During the development of advanced man-machine systems a number of important questions must be resolved.
Many of them relate to human performance or manning requirements. For example: How much attention
is required by operator tasks? Which tasks can be assigned to a single operator? How long can an
operator perform his task effectively without a rest break? How much learning or training is necessary?
What is the minimum crew size for a system? How will time pressure and other stresses affect task and
ultimately mission performance? All the foregoing questions relate to performance and workload.

The designer/planner, based on his appraisal of the possible contingencies, typically attempts to
minimize the frequency, extent and seriousness of work overload situations. However, he can neither
personally nor vicariously, through others, rigorously assess the workload, or potential workload, without
a standard metric for adequately defining and quantifying it. Even if he does identify particular periods
of potential workload excess, he does not, except in extreme and obvious cases, have quantitative information
to assist in deciding which of the instances are the most critical and demanding and hence should,
within resources and technological limitations, be given priority consideration in design. Nor does he
have a criterion by which he can decide and demonstrate that the problem has been reasonably resolved.

In the development laboratories, alternative proposed designs or arrangements, or alternative procedures,
may be compared on the basis of speed, accuracy, or errors. However, operationally significant
differences may not be revealed simply because the subjects are able to, and do, muster their resources
("try harder") and thus compensate for what would otherwise be real differences.

Work overload at the mental or "cognitive" level has been associated with increases in the United
States Air Force aircraft accident rate (Miholick, 1978). For example, during 1977 and 1978 "channelized
attention" or "distraction" were factors in 16 accidents involving the loss of 12 aircraft, 9 fatalities,
and a dollar loss of over 81 million dollars. "Task saturation," which results in intense concentration
on the task perceived to be most important at the expense of other critical performance requirements, was
classified as "channelized attention." "Distraction" was used to refer to occasions in which an unexpected
task causes attention to be diverted to coping with the cause of the unscheduled task load.

If we are to make safe, economical use of human and material resources it is necessary to determine
efficient crew compositions, appropriate assignments of duties and responsibilities to crew members, and
effective allocations of functions and tasks among men, machines and computers (including software). In
addition, it is necessary to identify the critical periods in a task or mission during which the operator's
performance is particularly prone to degradation or failure because of work-overload stress. Further, it
is necessary to provide improved, valid and quantitative methods for assessing equipment and system design,
and procedural alternatives; and for mission planning and survivability/vulnerability analyses, to locate
and quantitatively define the most critical and demanding task segments. In a parallel view, it is
necessary to identify and quantitatively define those periods, if any, of sub-optimal workload stress so
that the resources can be used elsewhere, or so that provisions can be made to preclude or alleviate
boredom, loss of "sharpness" or alertness, etc., the effects of which can carry over to and jeopardize
performance in subsequent tasks or mission periods. Due to the rapid advances in computer technology
and the more centralized role computers assume in advanced systems, emphasis probably should be on
man-computer interactions and information processing/decision-making functions which are not adequately
accounted for by conventional human performance metrics, task analysis, time-and-motion, and time-line
methods.

The principal objectives of a supportive workload research and development program should be (1)
establishment of a set of theoretically-consistent component functions descriptive of the performance
of crew members in relevant system tasks; (2) development of quantitative (mathematical) expressions of
relationships between input-output parameters for the component functions and appropriate combinations
thereof; (3) integration of the results of (1) and (2) above into a task analytic/computer modeling
methodology; and (4) validation of the analytic/predictive methodology in a system design, development
and test effort. Examples of approaches and methods contributing to achievement of the above objectives
follow.

Adoption  of  a  Workload  Concept 

Ryan (1947) addressed the problem of measuring the cost of sedentary, or "mental," work some 30 years
ago in his text on the psychology of production. His concept of effort, presented in the same context, is
similar to the concept of workload as it is used today. Ryan identified four possible meanings for effort:

(1)  energy  consumption, 

(2)  cost  of  work  (e.g.,  fatigue,  loss  of  health,  dissatisfaction,  etc.), 

(3) aspects of psychological functioning which describe the "experience" of the worker as he performs
his job, and

(4) the rate of performance of an individual in relation to the maximum possible rate of performance
under the given conditions.

Ryan indicated his preference for the latter (fourth) meaning of effort probably because it required
that task performance be not only measured but also related to the capacity of the worker. The discussion
of topics which ensues is based upon the assumption that effective resolution of workload problems depends
upon the capability to measure, by a common metric, both task demands deriving from work situations and the
inherent capabilities of the worker to meet them.

Performance Theory Development and Application

In order to progress in an orderly, systematic manner, it is necessary to explain and relate pertinent
facts in a logically consistent manner. Current human performance theory can serve that function
in a workload assessment program. The primary goal of human performance theory is to analyze human
capabilities in a manner which will permit (1) identification and description of basic, component functions
and (2) quantification of the limits of capacity in each component function. Theories which treat
the human as principally an information processor of limited capacity appear to be most appropriate.

Some of the research issues which have been associated with the development of such a theory are
revealed by the following "types" of theories:

(1) Single Channel Theory (Welford, 1952; Broadbent, 1958). The human is strictly a "serial"
processor.

(2) Undifferentiated Capacity Theory (Moray, 1967; Kahneman, 1973). The human behaves much like
a time-sharing computer with task interference strictly a function of total demand rate rather
than specific to the nature of the processing tasks competing for capacity.

(3) Limited Capacity Central Mechanism Theory (Posner and Keele, 1970). Some, but not all, processes
require the "central mechanism"; hence, parallel, as opposed to serial (Single Channel),
processing is sometimes possible.

An example of current theorizing based largely on the single channel concept is that of W. H.
Teichner. For the past several years various U.S. Government agencies sponsored efforts of Teichner
to develop a general theory of human performance. The goal was a systematic approach to prediction of
human performance as a function of task variables and environmental factors (Teichner and Olson, 1971).
Teichner drew heavily upon the available experimental psychology and physiology literature to identify
empirical relationships and develop models of simple tasks which could be combined into a more comprehensive
model or theory or used in predicting performance in more complex tasks (Teichner, 1974).

Based on observations of people engaged in a large variety of work situations, Teichner concluded
that the same general functions comprise the various human activities involved; hence, the feasibility
of modeling any human activity in terms of a finite set of generic subtasks. Teichner and Olson held
that tasks always involve a transfer of information from an initial input to a final output. In other
words, the human is a system which functions through a series of communication links and subtasks and
that system is the same whether flying an airplane or dialing a telephone. No matter how the man-machine
system context varies, at a given level of human system analysis the only differences will be in the
activity or degree of loading of the subtasks.

Teichner's theoretical approach is consistent with human engineering and system analysis tradition
in referring to man and machine as components of man-machine systems. Any operation on information
within a component, whether man or machine, is called a "process" whereas transfers of information
between components are called "tasks."

Although both the maximum complexity and maximum capacity of the human are constant according to
Teichnerian theory, system capacity may be varied in a number of ways. For example, since operations
may be performed by different combinations of available generic subtasks, it may be possible to replace
the limiting function in a serial process with a higher capacity subtask. Or, the system may be
redesigned for parallel processing at the limiting stage by allocating the function to another component,
e.g., a machine or another person. Assuming the human is a single channel system, the maximum processing
rate can be no greater than the capacity of the lowest capacity stage in a sequence, of course.
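The bottleneck argument above can be illustrated with a short Python sketch. The stage capacities (in bits per second) are hypothetical, and the function names are introduced here only for illustration. The sketch also assumes that when a stage is shared with another component working in parallel, the two capacities simply add; that additivity is an assumption of this example, not a claim of the theory.

```python
def serial_capacity(stage_capacities):
    """Maximum rate of a serial sequence: that of its lowest-capacity stage."""
    return min(stage_capacities)

def capacity_with_parallel_help(stage_capacities, stage, added_capacity):
    """Rate after sharing one stage with another component in parallel.

    Assumes the helper's capacity adds to the capacity of the shared stage.
    """
    updated = list(stage_capacities)
    updated[stage] += added_capacity
    return min(updated)

# Hypothetical capacities of three sequential subtasks, in bits per second.
stages = [12.0, 5.0, 9.0]

baseline = serial_capacity(stages)                        # limited by stage 1
relieved = capacity_with_parallel_help(stages, 1, 6.0)    # stage 1 becomes 11.0
```

After the limiting stage is relieved, a different stage becomes the bottleneck, so further gains require attention to that stage instead.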

In developing his performance theory, Teichner bypassed the task taxonomy problem and went directly
to empirical relationships and principles which could be used to predict dependent measures. The theory
builds upon Donders' Law which is based on data obtained from attempts to measure the physiological time
of mental processes associated with discrimination and choice in 1868 (Woodworth and Schlosberg, 1955).
Donders' Law simply states that choice reaction-time (CRT) is composed of simple reaction time (a constant),
stimulus categorization time, and response selection time.

Teichner initially modified Donders' Law as follows: (1) Stimulus identification time was included
in simple reaction-time; (2) Stimulus code-to-response code translation time ($T_{S-R}$) was substituted for
the response selection component; (3) Stimulus code-to-stimulus code translation time ($T_{S-S}$) was added to
account for tasks in which it was necessary to transform one stimulus code to another before selecting a
response; and (4) another component (c) was added to cover time required to select the motor program for
executing the response. The resulting equation was:

$$\mathrm{CRT} = a + T_{S-S} + T_{S-R} + c$$

in which "a" includes both stimulus encoding time and neural transmission time.
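As a worked illustration of the additive form of this equation, the Python sketch below sums the four components for two cases, one with and one without a stimulus-to-stimulus translation step. All component durations (in milliseconds) are hypothetical values chosen for the example, and the function name is introduced here for illustration only.

```python
def choice_reaction_time(a, t_ss, t_sr, c):
    """Sum the components of the modified Donders equation.

    a    : stimulus encoding plus neural transmission time
    t_ss : stimulus code-to-stimulus code translation time
    t_sr : stimulus code-to-response code translation time
    c    : motor program selection time
    """
    return a + t_ss + t_sr + c

# Hypothetical component times in milliseconds.
direct = choice_reaction_time(a=180, t_ss=0, t_sr=120, c=40)
recoded = choice_reaction_time(a=180, t_ss=60, t_sr=120, c=40)
```

Because the model is additive, any time spent on S-S translation appears directly in the total unless it is offset by a shorter S-R translation time.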

Teichner adopted a response criterion model proposed by McGill (1963) and Grice (1968) in order to
account for empirical evidence that the "a component" of the above equation depends upon stimulus
intensity and duration (Teichner and Krebs, 1972).

Teichner proposed to use coding theory and information metrics to quantify S-S translation. Two
examples of S-S translation are compression and classification. Compression is exemplified as follows:
Assume a four message, binary source encoded thus: 0001, 0010, 0100, and 1000 with equal probabilities
of occurrence. Compression could be achieved by recoding, e.g., 00, 01, 10, 11, with no change in
message probability. The average value of the original, or source code ($L_s$), is 4 bits per message as
compared to 2 bits per message for the recoded messages ($L_c$). In coding theory, the average compression
for a sequence of symbols is called the compression coefficient and is represented by the equation:
$m = L_c/L_s$. The value of stimulus compression is in its effect on the S-R translation process. Because
there is less information in the compressed message, the S-S translation should involve less time and
error. Obviously the loss resulting from the compression process must be less than the gain at the S-R
stage for it to be worthwhile.

The second form of S-S translation identified by Teichner is classification, which results in a
reduction in the number of messages. S-S classification is exemplified as follows: Assume a message set
of four, e.g., F1, F2, B1 and B2. This message set may be sorted into F (fighter) and B (bomber), a case
of four-to-two mapping.

It can be seen that the cost effectiveness of S-S translations as described above is assessable in
terms of changes in the information transmission rate (R) achieved for CRT tasks. The cost effectiveness
index for compression (CE_c) is CE_c = R/m; that is, CE_c is the rate of information processing per unit of
compression. The cost effectiveness of reduction in messages through classification (CE_r) is expressed
as follows: CE_r = R/(H_c/H_s), where H_s is the amount of information in the original message set and H_c is
the amount of information in the set after classification.
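The two indexes can be illustrated with the examples given above. The following is a minimal sketch, not Teichner's own computation: it assumes the four messages in each example are equally likely, reads the classification index as R divided by the ratio H_c/H_s (by analogy with R/m for compression), and uses a purely hypothetical transmission rate R. All function and variable names are illustrative.

```python
import math


def average_code_length(codewords, probabilities):
    """Average number of binary symbols per message for a given code."""
    return sum(p * len(word) for word, p in zip(codewords, probabilities))


def information_bits(probabilities):
    """Shannon information (bits per message) of a message set."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def compression_coefficient(source_code, recoded, probabilities):
    """m = L_c / L_s, the ratio of recoded to source average code length."""
    l_s = average_code_length(source_code, probabilities)
    l_c = average_code_length(recoded, probabilities)
    return l_c / l_s


# Compression example from the text: four equiprobable messages.
probs = [0.25, 0.25, 0.25, 0.25]
m = compression_coefficient(
    ["0001", "0010", "0100", "1000"], ["00", "01", "10", "11"], probs
)

# Classification example from the text: F1, F2, B1, B2 sorted into F and B.
# Equal message probabilities are an assumption made for this sketch.
h_s = information_bits(probs)        # original four-message set
h_c = information_bits([0.5, 0.5])   # two classes after sorting

R = 3.0  # hypothetical information transmission rate, bits/sec
ce_c = R / m
ce_r = R / (h_c / h_s)

print(f"m = {m}, CE_c = {ce_c} bits/sec per unit of compression")
print(f"H_s = {h_s} bits, H_c = {h_c} bits, CE_r = {ce_r}")
```

With these assumed inputs both m and H_c/H_s equal 0.5, so a given transmission rate yields the same index value for the compression and the classification example.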

The same cost effectiveness concepts may be applied to the S-R translation process, in which case
the recoded message is a response and is defined by a response code. Again, the impact of reduction or
compression is expected to be greater speed and accuracy of response selection.

Teichner  clearly  distinguishes  between  response  selection  and  response  execution.  It  is  assumed 
that  responses  are  always  defined  symbolically  by  response  codes.  Only  after  the  appropriate  response 
code  has  been  matched  with  the  stimulus  code  does  the  associated  motor  response  begin.  The  ensuing 
response  execution  may  entail  a  series  of  effector  selections  whether  the  response  modality  is  limb 
movement,  body  movement,  or  speech.  Execution  time  will  depend  on  factors  such  as  distance  travelled, 
amount  and  direction  of  force  exerted,  etc. 

Teichner's thinking toward a complete theory, or model, of performance is represented by the flow
diagram in Figure 1. The "a" component of his equation for CRT derives from a combination of sensory
register and scanner functioning.
The flow diagram shows that the response criterion applied by the scanner derives from long-term
stimulus memory (LTM-S), which also establishes operating levels for activating systems and scanner rate.
LTM-S also provides for selective tuning of the register so that thresholds of "energy cells" for expected
stimuli will be lower than for unexpected stimuli. Sensory register properties are derived from Hubel
and Wiesel (1962).

Teichner hypothesized that the human component obtains information transmission rates consistent
with system demands by making speed-accuracy tradeoffs in the following manner. At the input stage,
with experience at a task, an individual learns what stimuli to expect, how much stimulus evidence is
required to respond, and what sampling rate is required, and makes sensory register/scanner adjustments
consistent with task demands relative to speed and accuracy. Depending upon the operations involved,
a range of speed-accuracy variations may be available at the S-S and S-R translation stages. And,
finally, at the output stage the response criterion may be adjusted upward or downward to favor either
accuracy or speed depending upon the information transmission demands of the system.

Habituation is handled in a way consistent with Sokolov's (1963) neuronal model. When a novel
stimulus passes to the S-S translation stage and cannot be matched with a relevant event in LTM-S, the
responsive register cell is tuned toward increasingly high threshold levels on successive occasions.

When a stimulus event is detected by the scanner mechanism, a corresponding unit of short-term memory
(STM) is activated for a duration of time (e.g., 30 seconds) during which comparison can be made with
LTM-S in support of the S-S translation. Teichner suggests that several available models are consistent
with the latter process (Norman, 1970; Saunders, Smith and Teichner, 1974).

The importance of Teichner's theorizing to workload assessment rests largely in its potential impact
on task analysis. Traditional human engineering task analyses provide an overwhelming amount of detail
almost totally unrelatable to available theoretical concepts and principles. Part of the difficulty is
attributable to the fact that the conceptual frames of reference tend toward anatomical rather than
functional task descriptions. Teichner's goal was to systematize the description of operator tasks and
performance at a generic level consistent with both the environment/performance literature and the
operational situation. An attempt to verify the applicability of a portion of Teichner's theory for a
system simulation will be summarized in a later section.

Theory  Testing  via  the  Divided  Attention  Paradigm 


Data, such as that obtained by W. E. Hick (1952), relating reaction time to the amount of information
transmitted, and to the degree of stimulus-response compatibility (Garvey and Knowles, 1954), caused the
idea that independent associative links exist between each stimulus and response to be replaced by the
concept of a mediating limited capacity central mechanism (or system). The single channel interpretation
of this system (Welford, 1952, and Broadbent, 1958) holds that a signal entering the system dominates the
entire channel from the time it is selected until the response is initiated. Any other contending
signals are either filtered out or held in store and gated into the channel after the response to the
previous signal. Increase in response time for each unit of information transmitted provided measures
of the processing demands a signal places on the limited capacity system.


However, additional research, principally task interference studies, suggested a need to modify the
single channel concept. While the single channel, or serial processing, model requires that the time to
perform two tasks simultaneously should equal the total of the times required to perform each task alone,
sometimes it is found to be much less (Keele, 1967). This suggests that in such instances some components
of the separate tasks may be processed in parallel and, hence, do not require exclusive use of a single
channel mechanism. Attempts to account for this apparent parallel, rather than serial, processing led to
two alternative theories.

One of the alternatives is the general, undifferentiated capacity theory, which holds that interference
between tasks occurs only when the total number of non-specific "processing units" is exceeded
by the demand. That is, task interference is not specific to the peculiar nature of the competing task
components, or operations, involved, but simply reflects an "overdraw" on the available pool of capacity
units. Moray (1967) modified this interpretation somewhat by hypothesising a limited capacity processor,
similar to a time-sharing computer, which allocates from its undifferentiated processing capacity amounts
consistent with the demands of operations performed on the signal.

The second alternative to a single channel theory derives from the proposition that some, but not
all, operations performed by the human information processing system are channeled through the limited
capacity central mechanism (Posner and Keele, 1970). Thus, operations which do not require the mechanism
may proceed in parallel without ever interfering. While it has been suggested that the limited capacity
mechanism may be either a single channel or a parallel processing system which processes multiple signals
with reduced efficiency (Kerr, 1973), it may be that there are several limited capacity mechanisms, each
of which is peculiar to a particular type of signal, sensory mode, or operation. It has been suggested
that the amount of interference between operations depends upon overlap between factors such as verbal
or spatial demands (Brooks, 1967; Allport, et al, 1972). Perhaps, after the fashion of Spearman's theory
of intelligence, there are central mechanisms peculiar to each of several "specific factors" whereas
operations of a "general" nature are processed in parallel. (Incidentally, Teichner preferred a serial
processing model and was confident that he could account for any apparent contradictions before his
theoretical development was complete.)

Divided attention effects produced by requiring subjects to attempt two tasks simultaneously provide
an excellent basis for evaluating hypotheses generated by any of the three variations of limited capacity
theory. This fact has been recognized by several theorists. The result has been a proliferation of
secondary tasks beyond the rather large number produced by engineering psychologists during the 1930's
and 1960's. During the latter era, numerous researchers tailored secondary tasks for compatibility with
primary tasks and used them to evaluate the efficiency of alternative procedural or man-machine interface
designs. Although the results were valuable to the specific applications, they made few contributions
to a basic understanding or quantification of human performance capabilities and limitations because of
the lack of standard methods and metrics. There were obvious practical reasons for that deficiency, which
have been identified and discussed by Knowles (1963).

There is also some justification for using a variety of secondary tasks in exploring issues derived
from the limited capacity mechanism theories. However, the Sternberg task and associated model of
information processing stages hold a great deal of promise as a more or less standard approach to both
theory testing and reserve capacity measurement (Sternberg, 1969). In addition to its power in theory
testing and development, which has been demonstrated by the late George Briggs and his associates,
primarily under sponsorship by the USAF Aerospace Medical Research Laboratory and the Office of Scientific
Research (Briggs, et al, 1969, 1970, 1972), the task provides a method for assessing reserve capacity for
a variety of workload situations. Although relatively simple and readily learned, the Sternberg task
facilitates manipulation and control of three key functions in information processing/decision making
tasks: (1) input, (2) central processing, and (3) output. Both input and output are readily quantified
in information metrics, a common measure to biologists/neurophysiologists, behavioral scientists,
communications and computer system engineers and, hence, potentially a boon to effective system engineering
including associated man-machine tradeoffs and functions allocation. Moreover, the Sternberg task is
amenable to variations in stimulus (e.g., visual, auditory, tactile) and response (manual, vocal) mode,
making it adaptable to a variety of dual task situations.

The Sternberg task is a choice-reaction task which facilitates manipulation of the loading at
Stage 2 (Central Processing) while holding the requirements on the other stages constant. Stage 2
loading is varied by changing the number of "positive set" items (e.g., letters, digits, tones) the
subject must maintain in memory. In performing the task, a subject listens, or watches, for a stimulus
cue, or memory "probe," while maintaining a readiness to respond via a response device "yes" or "no"
depending upon whether the cue "matches" or "does not match" an item stored in memory.

In applying the Sternberg task to the study of divided attention effects, the task is
first administered alone to obtain "baseline" data for 3 or more different "memory loads," e.g., 1, 2,
3 and 4 items. The resultant data (using correct responses only since incorrect responses are held to
a "negligible" level) are used to plot reaction time (on the ordinate) against memory load (on the
abscissa). A linear equation is fitted to this data plot to obtain a straight line with a particular
slope and y-axis intercept value. The intercept reflects time required for Stage 1 and Stage 3.
The slope of the line reflects central processing time, i.e., Stage 2. Then, by requiring subjects to
perform the same Sternberg task simultaneously with a second task, which is treated as the primary or
priority task, one can acquire information relative to the nature and amount of workload imposed by
the second task. For example, if the slope of the equation for the Sternberg data plot changes between
the baseline and dual task conditions, the second task imposes significant demands on Stage 2 or central
processing. If the intercept changes, the demands of the second task occur at Stages 1 and/or 3. The
amount of change involved can be quantified in terms of the information metrics, bits and bits/sec., to
obtain an indirect indication of workload associated with the task under study.
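The fitting and comparison logic just described can be sketched as follows. This is a minimal illustration under stated assumptions: the reaction times are invented for the example, and the two tolerance values stand in for the statistical tests a real analysis would apply to slope and intercept differences. The names `fit_line` and `localize_effect` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def localize_effect(baseline, dual, intercept_tol, slope_tol):
    """Compare (intercept, slope) pairs from baseline and dual task fits.

    An intercept shift points to Stages 1 and/or 3 (input/output);
    a slope shift points to Stage 2 (central processing).
    The tolerances are placeholders for proper significance tests.
    """
    return {
        "stage_1_or_3": abs(dual[0] - baseline[0]) > intercept_tol,
        "stage_2": abs(dual[1] - baseline[1]) > slope_tol,
    }


# Hypothetical mean reaction times (msec) for memory loads of 1-4 items.
memory_load = [1, 2, 3, 4]
baseline_rt = [440.0, 480.0, 520.0, 560.0]
dual_task_rt = [520.0, 580.0, 640.0, 700.0]

baseline_fit = fit_line(memory_load, baseline_rt)
dual_fit = fit_line(memory_load, dual_task_rt)

print("baseline intercept, slope:", baseline_fit)
print("dual task intercept, slope:", dual_fit)
print(localize_effect(baseline_fit, dual_fit, intercept_tol=20.0, slope_tol=5.0))
```

In this invented data set both the intercept and the slope rise under the dual task condition, so the sketch flags demands at the input/output stages and at central processing.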

The utility of the Sternberg task is readily apparent from a review of the research program pursued
by Briggs and his associates at Ohio State and New Mexico State Universities. Briggs's research centered
around efforts to isolate divided-attention effects within one or more of the four possible stages of
the Smith (1968) paradigm: (1) encoding processes, which entail registering, sampling and
preprocessing of stimulus information; (2) central processing (detailed analysis of sampled information
for stimulus identification and definition); (3) response decoding; and (4) response control and
execution. The essence of the research program results is, perhaps, summarized most concisely by tracing
the progressive expansion of the function proposed by Sternberg (1969) for describing the relationship
between choice-reaction time (CRT)¹ and the size of the positive memory set. (The reader should recall
that the Sternberg technique requires that a subject first memorize a set of items of size M. Then, at
a later time, an item, or "probe," is presented and the subject responds as to whether the item is, or is
not, a member of the memorized, or "positive," set. The major dependent measure is the time (CRT) from
presentation of the "probe" item until the response is executed.)

Sternberg expressed the function so:

RT = a + b(M)

Data collected by Swanson and Briggs (1969) showed a logarithmic relationship between CRT and memory
load, which led to the postulation that response time is a function of central processing uncertainty
(H_c), a metric from information or communication theory (Shannon, 1949). Hence, Sternberg's expression
was modified to read:

RT = a + b(H_c)

Subsequently, Swanson and Briggs (1969) demonstrated that the intercept constant (a) was linearly
related to the amount of information transmitted (H_t), a communication theory metric of response accuracy;
thus, the expression became:

RT = c + d(H_t) + b(H_c)

An experiment by Briggs and Blaha (1969) suggested that b could be expressed in terms of the number
of displayed items to be classified (D) and the equation was modified again, thus:

RT = c + d(H_t) + e(H_c) + f(H_c D)

Briggs and Swanson (1970) next varied the response load (R) in an experiment. The results showed
that it could be partialed out, thus quantifying still another component of performance, and the
expression was now:

RT = i + j(H_t) + h(R) + e(H_c) + f(H_c D)

By relating this resultant equation to the Smith information processing paradigm, Briggs (1972)
made estimates of the time required for specific functions of the human information processing system.

For example:

    Preprocessing time:                                 180-280 msec
    Stimulus sampling rate:                             6.5 bits/sec
    Recoding (Teichner's S-S translation):              25 bits/sec
    Transfer from long-term memory to active memory:    39 bits/sec
    Stimulus classification:                            16 bits/sec
    Response decoding:                                  6 bits/sec

This is the type of quantitative information and generic classification scheme which is needed to permit
the desired state-of-the-art advance in analytic/predictive methodology to effectively complement task
and time line analyses during system design. Of course, a great deal of theory development and testing
remains to be done.
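To suggest how such rate estimates might feed a time line analysis, the sketch below converts the rates listed above into processing time per bit and into the time needed for a given information load. It assumes, purely for illustration, that processing time scales linearly with the number of bits at each function; the dictionary keys, function names and the 2-bit load are hypothetical.

```python
# Rate estimates (bits/sec) as listed in the text for Briggs (1972).
RATES_BITS_PER_SEC = {
    "stimulus_sampling": 6.5,
    "recoding": 25.0,
    "ltm_to_active_memory_transfer": 39.0,
    "stimulus_classification": 16.0,
    "response_decoding": 6.0,
}


def msec_per_bit(rate_bits_per_sec):
    """Time in milliseconds needed to process one bit at the given rate."""
    return 1000.0 / rate_bits_per_sec


def processing_time_msec(bits, rate_bits_per_sec):
    """Time in milliseconds for a load of `bits`, assuming linear scaling."""
    return bits * msec_per_bit(rate_bits_per_sec)


for function_name, rate in RATES_BITS_PER_SEC.items():
    # The 2-bit load is a hypothetical example value.
    print(
        f"{function_name}: {msec_per_bit(rate):.1f} msec/bit, "
        f"{processing_time_msec(2.0, rate):.1f} msec for 2 bits"
    )
```

Under this linear assumption, stimulus classification at 16 bits/sec corresponds to 62.5 msec per bit and recoding at 25 bits/sec to 40 msec per bit.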

In 1974, Briggs, Johnsen and Shinar took a step toward integrating the Sternberg/Smith information
processing paradigm with fundamental decision-making research by using a Bayesian decision expression
to account for the sequence of decisions made by a subject in a classification task. Thus, a link
has been established between simple choice behavior and more complex decision processes, which suggests
something of the potential for expanding and validating basic performance theory applicable to critical
command-control-communication system design issues.

Workload  Assessment  as  an  Aid  to  Design 

Questions concerning the impact of digital avionics for pilot workload have provided an opportunity
for preliminary tests of both performance theory and the divided attention paradigm in an applied setting
(Crawford, Pearson and Hoffman, 1978). The opportunity developed as follows:

The evolution of compact digital computers has made possible the development of digital avionics
information systems. Such systems promise a number of advantages to both aircraft designers and users.
For example, when interfaced with multipurpose cathode ray tube displays and multifunction switches,
digital computation and storage capabilities can be used to reduce the number of dedicated instruments
competing for cockpit panel area. Information which is not required by the pilot on a continuous or
frequent basis can be stored and presented on demand, either automatically, as related programmed mission
events transpire, or in response to manual control actions (Zipoy, Premselaar, Gargett, Belyea and Hall,
1970). And with reduced demands for panel space, it will be easier to locate the multipurpose controls
and displays in prime reach and viewing areas.

¹See Woodworth and Schlosberg (1955) for a review of Donders' classic research on simple and disjunctive
reaction time in 1868.

However, experienced pilots are troubled by the prospect of possible added activity, both mental and
physical, required to gain access to information which is normally on dedicated instruments. Should the
demand for such activities occur during peak operator workload, the impact on mission success might not
be offset by the increased calculating power, speed, or accuracy afforded by the digital processor.

Hence, a study was planned to evaluate the impact of multipurpose control/display tasks on the pilot's
reserve capacity. Of particular interest was the question as to whether or not the maintenance of
knowledge of procedures associated with multifunction keyboard operation reduced the operator's reserve
capacity for making choices or decisions such as might be required to handle contingency situations during
a mission. Another purpose of this study was to investigate the compatibility of keyboard operations with
continuous flight control tasks.

A computer-based simulator was used to present and score the task situations investigated (Brandt and
Wartluft, 1975). Of the three different tasks involved, two, flight control and communications/IFF
switching functions, represented actual tasks in aircraft systems. The third was a variation of the
Sternberg task, which served as a test to measure cognitive reserve capacity under various primary task
conditions. All three tasks were implemented within a fixed-base cockpit simulator.

The front panel of the cockpit was equipped with three CRT-type displays. The center display was
used to present information concerning basic flight parameters in a moving tape format. The cockpit also
contained a throttle with afterburner switch (left side panel) and a center-mounted joystick control which
were used, in combination with the displayed flight information, to "fly" various maneuvers. Printed
computer outputs of simulator performance data included both mean absolute and root mean square error
relative to specified control values based on "fly to" instructions for altitude, heading, bank angle,
pitch, indicated airspeed, vertical velocity, angle-of-attack, and g-load.

Between the front instrument panel and left side panel was a multifunction keyboard (MFK). This
MFK, in combination with the CRT on the upper left of the front panel and a numerical entry keyboard,
which was also located on the instrument panel (lower left), was used to simulate a multifunction
interface with digital avionics subsystems. Subsystems, functions and states were displayed on the CRT to
complement the feedback afforded by back-projected legends on the MFK push button faces.

The Sternberg task procedure used in this study was as follows: At the start of an experimental
session, the experimenter read to the subject a set of 1, 2, 4 or 6 letters of the alphabet. The subject
was asked to retain the set in memory during the succeeding block of trials. The four sets used were as
follows: A, AH, AHJQ and AHJQSX. (Such sets are referred to as "positive sets.") During the block of
trials the subject was presented (via a cassette tape player connected to his headset) a series of test
stimuli or "probes" to which he was to make one of two responses: (1) "yes," the test stimulus matches
the positive set, or (2) "no," it does not match, and, hence, is a member of a negative set. The negative
set included the 9 letters B, C, E, F, G, I, L, R and Y. Negative and positive stimuli occurred with
equal probability (.5). Letters within the two sets also occurred with equal likelihood. The average
inter-stimulus interval was 5.5 seconds and ranged from 3 to 7 seconds. "Yes" was indicated by the
subject's pushing forward on a thumb switch on the joystick controller used for flight control; "no" was
indicated by moving the thumb switch backward, i.e., toward the subject. Reaction times were scored
automatically to the nearest millisecond. If a subject did not respond within 2 seconds the trial was
scored "no response."

Central processing uncertainty (H_c) values for this study are: 1.00, 1.50, 2.00 and 2.31 bits for
the 1-, 2-, 4- and 6-item memory sets respectively. Because there is always a 2-choice response, response
uncertainty (H_r) = 1.0 bit in each instance (Attneave, 1959).
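One way to arrive at values of this kind is sketched below. It is an assumption, not a documented reconstruction of the authors' computation: each positive-set letter is treated as a separate alternative sharing the 0.5 probability of a positive probe, and the whole negative set is treated as a single alternative with probability 0.5. Under that reading the sketch gives 1.00, 1.50 and 2.00 bits for the 1-, 2- and 4-item sets and roughly 2.29 bits for the 6-item set, slightly below the 2.31 bits reported, so the exact procedure used in the study may have differed. The function names are hypothetical.

```python
import math


def uncertainty_bits(probabilities):
    """Shannon uncertainty (bits) of a set of alternatives."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def central_processing_uncertainty(set_size, p_positive=0.5):
    """Assumed H_c: each positive item is one alternative, the negative
    set as a whole is one further alternative."""
    p_item = p_positive / set_size
    alternatives = [p_item] * set_size + [1.0 - p_positive]
    return uncertainty_bits(alternatives)


def response_uncertainty(n_choices=2):
    """H_r for equally likely response choices."""
    return math.log2(n_choices)


for size in (1, 2, 4, 6):
    print(f"{size}-item set: H_c = {central_processing_uncertainty(size):.2f} bits")
print(f"H_r = {response_uncertainty():.1f} bit")
```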

Four male subjects were used in the study. They were paid volunteer university students with an age
range of 20-24 years. During the experiment a nominal cash incentive system was implemented to encourage
performance. The amount of the incentive was based on the subject's relative standing in the group with
respect to task performance criteria for each session. For dual task conditions the incentive value was
weighted so as to emphasize priority for the flight control task when it was present. The incentive was
weighted in favor of the MFK task when it was paired with the Sternberg task.

Prior to the experiment proper each subject was trained on all three tasks. Training sessions lasted
two hours and were scheduled 2-4 times per week. Each subject was trained until task performance measures
appeared to asymptote. Then each subject was tested under six different conditions, three single task
conditions and three dual task conditions: flight control, MFK and Sternberg choice-reaction task, alone;
and flight control plus MFK, flight control plus Sternberg task and MFK plus Sternberg task. When the
Sternberg task was combined with MFK, it occurred only during periods when the subject was awaiting
instruction for an MFK task of a given difficulty level. This was consistent with the interest in
measuring cognitive loads associated with anticipation of MFK tasks rather than actual performance of them.

The  single  task  conditions  preceded  the  dual  task  conditions  for  all  subjects. 

The four levels of MFK task difficulty investigated were quantified in terms of the number of bits
of information transmitted via the keyboard in performing the tasks. The average value for each level was:
I - 7 bits, II - 11 bits, III - 17 bits and IV - 26 bits.

The type of maneuver "flown" was the independent variable for the flight control task. Although
seven maneuvers were flown, preliminary analyses showed that not all maneuvers were discriminable in terms
of the weighted tracking error scores. Hence, the maneuvers were combined into two groups labelled "easy"
and "difficult." "Easy" maneuvers included straight and level flight and level turns. "Difficult"
maneuvers were climbing and diving turns. The error scores (X) were comprised as follows:

    Straight and level and stall:    X = (0.01) Δ altitude + (0.1) Δ airspeed
    Straight and level turns:        X = (0.01) Δ altitude + (0.1) Δ airspeed + (1.1) Δ g-load
    Turning dives and climbs:        X = (0.005) Δ vertical velocity + (0.1) Δ airspeed + (1.0) Δ g-load

The delta values represent average error, i.e., deviation from the
prescribed "fly to" value for the given flight parameter, per unit of time on the task. Altitude was
measured in feet, airspeed in knots and vertical velocity in feet/minute. The flight parameter combinations
and associated weights for each maneuver type were based on pilot opinion and research findings summarized
in a separate report (Woodruff, 1972). Performance on multifunction keyboard tasks was measured in
terms of task time and errors. The dependent measure for the Sternberg task was reaction time. Errors
and failures to respond within two seconds were also recorded.
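The weighted error scores lend themselves to a short worked example. The sketch below takes the weights as they are printed in the text and applies them to invented mean deviations; the maneuver keys, parameter names and sample deviations are all hypothetical, and the simulator's own scoring software is not described in the source.

```python
# Weights per maneuver type, as printed in the text.
ERROR_WEIGHTS = {
    "straight_and_level": {"altitude": 0.01, "airspeed": 0.1},
    "level_turn": {"altitude": 0.01, "airspeed": 0.1, "g_load": 1.1},
    "turning_dive_or_climb": {
        "vertical_velocity": 0.005,
        "airspeed": 0.1,
        "g_load": 1.0,
    },
}


def error_score(maneuver, mean_deviation):
    """Weighted sum of average deviations from the "fly to" values.

    `mean_deviation` maps parameter name to average error in the units
    given in the text (feet, knots, feet/minute, g).
    """
    weights = ERROR_WEIGHTS[maneuver]
    return sum(weight * mean_deviation[name] for name, weight in weights.items())


# Hypothetical deviations for two maneuvers.
easy = error_score("straight_and_level", {"altitude": 50.0, "airspeed": 5.0})
hard = error_score(
    "turning_dive_or_climb",
    {"vertical_velocity": 400.0, "airspeed": 10.0, "g_load": 0.5},
)
print(f"straight and level: X = {easy:.2f}")
print(f"turning dive or climb: X = {hard:.2f}")
```

The weights bring parameters measured on very different scales (feet, knots, feet/minute, g) into a comparable range before they are summed.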

A simple analysis of variance (repeated-measures design) was applied to the scores for the flight
control single task condition. The difference between easy and difficult conditions was statistically
significant (p < .05). The mean and standard deviation for the easy condition were 1.09 and 0.17.
Corresponding values for the difficult condition were 5.11 and 1.51.

The effect of MFK task difficulty proved statistically significant (p < .001). Mean task times
(seconds) and standard deviations (in parentheses) for the four difficulty levels were: I - 3.97 (0.32),
II - 5.95 (0.53), III - 7.43 (0.68), IV - 9.87 (0.83). The average rate of information transmission via
the MFK system varied from 1.8 bits/sec. to 2.6 bits/sec. across the four levels of MFK task difficulty.
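The quoted range of transmission rates can be checked by dividing the average information content of each difficulty level by its mean task time. The sketch below assumes the level-to-bits pairing I to IV = 7, 11, 17 and 26 bits given earlier; the variable names are illustrative.

```python
# (average bits transmitted, mean task time in seconds) per MFK level.
MFK_LEVELS = {
    "I": (7, 3.97),
    "II": (11, 5.95),
    "III": (17, 7.43),
    "IV": (26, 9.87),
}


def transmission_rate(bits, seconds):
    """Average information transmission rate in bits/sec."""
    return bits / seconds


rates = {
    level: transmission_rate(bits, seconds)
    for level, (bits, seconds) in MFK_LEVELS.items()
}

for level, rate in rates.items():
    print(f"Level {level}: {rate:.1f} bits/sec")
```

Rounded to one decimal place, the lowest and highest rates agree with the 1.8 and 2.6 bits/sec. reported in the text.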

The method of least squares was used to fit a straight line to the Sternberg data. The result is
reflected by the following regression equation for the single task, or baseline, condition:

RT = 549 + 118(H_c)

Although mean flight control error was greater when the flight control task was combined with MFK
tasks, the differences were not statistically significant. Similarly, MFK task times increased under dual
task conditions, but the differences were not statistically significant. Flight control error scores were
virtually identical for flight control alone as compared to flight control with the Sternberg task. The
Sternberg task had no statistically significant impact on MFK task time.

The method of least squares was used to fit linear equations to Sternberg response time data for each
dual task condition. This permits comparison of intercept and slope values with those obtained for the
Sternberg task baseline condition, for the purpose of localizing divided attention effects within the
four-stage information processing model.

Preliminary analysis showed no significant differences between levels of MFK task difficulty in terms
of slopes and intercepts. Hence, a single regression equation was derived for the combined MFK levels.
Equations for the resultant three dual task conditions are as follows:

Sternberg with MFK "Rehearsal"             RT = 617 + 118(Hc)

Sternberg with Easy Flight Control         RT = 694 + 98(Hc)

Sternberg with Difficult Flight Control    RT = 855 + 31(Hc)

F-tests (Snedecor and Cochran, 1967) indicate that (1) slopes and intercepts for the flight control
conditions differ significantly from those for the baseline condition, and (2) the intercept value varies
significantly between the baseline and MFK implicit rehearsal condition.

Interpreted in the traditional manner, the preceding results indicate that the effect of MFK "implicit
rehearsal" is in the input or output stage of information processing only. Following the empirical
evidence and logic of Briggs, et al (1972), the effect is probably in the input stage. The difference in
intercept values amounts to a 12% average increase in input-output time attributable to MFK "implicit
rehearsal."

Active flight control, on the other hand, appears to impact both input and central processing, as
evidenced by differences from baseline in both intercept and slope values for the regression equation.
Moreover, there is an increase in input-output time (28% and 55% for easy and difficult flight control,
respectively) and an increase in central processing rate. The central processing rate for the baseline
condition is 8.47 bits/sec. as compared to 10.20 bits/sec. and 32.26 bits/sec. for the easy and difficult
flight control conditions, respectively. This increase in central processing rate under the dual task
condition is consistent with results obtained by Lyons and Briggs (Briggs, et al, 1972). It was attributed
to the subject's conducting fewer or less complete tests of the probe stimulus under the greater loading
conditions. This apparent switch in mode of operation in the central processing stage may prove to be a
valuable aid to identification of significant workload changes.
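The processing rates quoted above follow directly from the regression slopes if RT is taken to be in milliseconds (an assumption consistent with the reported figures, since 1000/118 = 8.47). The sketch below, with hypothetical function names, recomputes the rates and the percentage change in intercept from the coefficients as printed; small differences from the percentages quoted in the text may reflect rounding or transcription of the printed coefficients.

```python
def processing_rate_bits_per_sec(slope_ms_per_bit):
    """Central processing rate implied by a slope given in ms per bit."""
    return 1000.0 / slope_ms_per_bit


def intercept_increase_percent(intercept_ms, baseline_ms):
    """Percentage increase of an intercept over the baseline intercept."""
    return 100.0 * (intercept_ms - baseline_ms) / baseline_ms


# (intercept, slope) pairs copied from the regression equations in the text.
equations = {
    "baseline": (549, 118),
    "MFK rehearsal": (617, 118),
    "easy flight control": (694, 98),
    "difficult flight control": (855, 31),
}

baseline_intercept = equations["baseline"][0]
for condition, (intercept, slope) in equations.items():
    rate = processing_rate_bits_per_sec(slope)
    change = intercept_increase_percent(intercept, baseline_intercept)
    print(f"{condition:26s} rate = {rate:5.2f} bits/sec, intercept change = {change:4.1f}%")
```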

The observed variations in Sternberg task response accuracy suggested the appropriateness of further
information analyses, i.e., calculation of the average amount of information transmitted (which would
reflect all the data, including erroneous responses and no responses). These values for the baseline and
two levels of each dual task condition are presented below.

AVERAGE INFORMATION TRANSMITTED (Ht) IN BITS FOR STERNBERG TASK

                                        Hc
Condition                    1.00    1.50    2.00    2.31

Baseline                      .85     .85     .88     .41
Easy MFK                      .86     .85     .94     .42
Difficult MFK                 .84     .88     .86     .32
Easy Flight Control           .82     .77     .79     .27
Difficult Flight Control      .72     .72     .79     .26

These data clearly indicate that the 6-item memory set (Hc = 2.31 bits) produced an overload situation
for every task condition.
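The overload effect can be made explicit by comparing, for each condition, the tabled Ht at Hc = 2.00 with the value at Hc = 2.31. The sketch below only restates the table as a data structure and takes that difference; the names `HT` and `overload_drop` are hypothetical, and no analysis beyond the subtraction is implied.

```python
HC_LEVELS = [1.00, 1.50, 2.00, 2.31]

# Average information transmitted (Ht, bits), copied from the table above.
HT = {
    "Baseline": [0.85, 0.85, 0.88, 0.41],
    "Easy MFK": [0.86, 0.85, 0.94, 0.42],
    "Difficult MFK": [0.84, 0.88, 0.86, 0.32],
    "Easy Flight Control": [0.82, 0.77, 0.79, 0.27],
    "Difficult Flight Control": [0.72, 0.72, 0.79, 0.26],
}


def overload_drop(values):
    """Drop in Ht between the Hc = 2.00 and Hc = 2.31 columns."""
    return round(values[2] - values[3], 2)


drops = {condition: overload_drop(values) for condition, values in HT.items()}
for condition, drop in drops.items():
    print(f"{condition:26s} Ht falls by {drop:.2f} bit")
```

Every condition loses roughly half a bit of transmitted information at the largest memory set, which is the pattern the text describes as overload.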

Effective Uncertainty Reduction. Since perfect performance is represented by Ht = 1.00 bit in each
instance, the above tabled values were taken to represent the percentage of the information reduction task
effectively accomplished by the subjects. An information reduction task is defined as one in which the
amount of uncertainty associated with the response is less than that associated with the stimulus (Coombs,
Dawes and Tversky, 1970). Thus, using the measures of central processing time, a set of "effective
uncertainty reduction rates" was derived and plotted graphically as shown in Figure 2. Note the consistent
increase in efficiency as Hc goes from 1.00 to 2.00 bits, with the overload effect at Hc = 2.31 for all
conditions. Further study of Figure 1 suggests that cognitive reserve capacity is reduced by 20, 31, 45
and 54 percent by the four primary tasks (easy and difficult MFK "rehearsal," easy and difficult flight
control), respectively.

With regard to the design issue addressed by the foregoing study, it appears that tasks imposed by the
multifunction switch concept place demands on the operator which may detract from the value of digital
processing capabilities in avionics systems. The concept necessitates the concentration of uncertainty,
normally distributed among the various dedicated instrument control/display interfaces, at a single
interface. Hence, uncertainty which is normally removed via separate controls and displays for each
subsystem/function has to be eliminated via keyboard actions on each occasion that the operator interacts
with the multifunction system. Thus, while the digitally-based MFK system is relatively efficient in
terms of action and information transmission rates, the tasks are generally more complex and take longer
than corresponding ones for dedicated instruments.

The MFK flight control simulation and data appeared to provide a good opportunity for evaluating the
practicality of general functions incorporated by Teichnerian performance theory. One of the more complex
MFK task sequences was selected for that purpose. The task involved the transmission of 40 bits of
information via 13 steps or key actions. Teichner's theoretical components were then "mapped on" to the
MFK task sequence. Then a second laboratory simulation was generated by using cards with symbols on them
to model the same set of theoretical task components included in the MFK task sequence. The card-symbol
simulation was used to generate a set of performance data using students at the University of New Mexico.
Although this effort was only exploratory and has not been formally documented, reasonably good agreement
between task time means and variances was obtained for the two sequences. Mean task time for the card
task was 8.3 seconds as compared to 8.7 for the MFK task.

As a further step toward integration of performance theory, part-task simulations of operator
workloads and system performance, a computer programmed model of the 40-bit MFK task components was
developed using Systems Analysis of Integrated Networks of Tasks (SAINT). (SAINT will be discussed in more
detail in a subsequent section.) Close agreement was obtained between empirical data from the cockpit
simulator and SAINT modeling output. One hundred sixty iterations of the computer model produced a
mean task time of 8.8 seconds.

Real-time simulations of operational tasks, as described above, are an essential part of the theory
development and testing process which must precede the achievement of an adequate analytic, descriptive
and predictive data base to effectively support workload allocation in man-machine systems.

Physiological  Correlates  of  Performance 

Another line of research promising significant insights into the basis of human workload capabilities
and limitations at the neurophysiological level, as well as providing intermediate workload assessment aids,
involves the measurement of physiological correlates of performance. In 1934, Luckiesh and Moss, lighting
experts, reported data on the relationship between heart rate and illumination level for a reading task.
The data showed decrements in mean heart rate as a function of task duration; moreover, the lower the
lighting level, the greater the decrement. Luckiesh and Moss interpreted the finding as indicative of
the greater amount of effort required under low light level conditions. However, M. E. Bitterman (1948),
in reviewing the lighting research literature, completely discredited this notion of Luckiesh and Moss in
the following words: "... everything we know about cardiovascular functioning would lead to quite the
opposite conclusion, i.e., that heart rate is directly rather than inversely related to the cost of work.
Heart rate is positively correlated with metabolic rate which we know to be a direct index of energy
expenditure, and Hadley (1941) has found a positive correlation between heart rate and muscular tension
which Dr. Luckiesh himself accepts as an index of exertion in visual work."

Whether Bitterman was correct or not in his criticism of Luckiesh and Moss, it is interesting to
note that they might have had a basis for appeal in the research of a physiologist, Darrow, who took an
apparently corroborative position in 1939, five years after Luckiesh and Moss published but prior to
Bitterman's review.

Darrow  (1939)  reported  data  to  support  his  postulation  that  both  noxious  stimuli  and  mental  activity 
involving  "associative  processes"  are  accompanied  by  cardiac  acceleration  in  contrast  to  attention  to 
sensory  stimuli  requiring  "no  extensive  association  of  ideas"  which  is  accompanied  by  cardiac  deceleration. 

Twenty-six years later, Lacey (1965), having reviewed a large number of related experimental findings,
rephrased and expanded Darrow's postulation by suggesting that behavioral arousal, electrocortical
arousal, and autonomic arousal are different forms of arousal and that the associated activation processes
reflect the intended aim or goal of behavior as well as its intensive dimension. In elaborating, Lacey
noted that an increasing number of psychophysiological experiments demonstrated that different stimulus
situations reliably produce different patterns of somatic response. Listening to auditory stimuli, looking
at pictures, tapping telegraph keys, warm and cold stimuli: each condition produces a different pattern
of somatic responses (Davis, et al, 1955; Davis, 1957). To illustrate, reception of external stimuli,
with no motor response required, produces a heart rate decrease concomitant with the more "typical"
increase in other autonomic responses, e.g., palmar conductance (Lacey, 1959; Lacey, et al, 1963; Obrist,
1963).

Without going into a detailed review of evidence cited by Lacey with regard to underlying physiological
mechanisms and the complex nature of relationships between the cardiac response and cortical
activity, perhaps it will suffice for the purpose of this general discussion to use Lacey's findings as
an indication of the potential value of physiological correlates of behavior as a relatively unobtrusive,
objective technique for analyzing task performance at the cognitive level and obtaining guidance with
respect to the stage, or stages, at which work overload occurs for any given individual, or group of
individuals.

Lacey and associates began by presenting eight "stressor-situations" in different orders to three
samples of subjects. The situations could be ordered along a continuum in that some required only
attentive observation of the environment, e.g., looking at an intermittently presented light, while
others involved increasingly greater amounts of internal cognitive functioning (retrieval of information
from memory and problem solving activity, as in mental arithmetic). The results consistently showed that
sensory intake was associated with cardiac deceleration and restraint of systolic blood pressure, whereas
tasks at the other end of the continuum (internalized cognitive processing) produced large increases in
heart rate and blood pressure. On the other hand, respiratory rate and palmar conductance showed the
non-specific, or nondiscriminant, activation pattern consistent with Cannon-based "arousal" or "activation
theory." Thus, depressor-decelerative processes are associated with facilitation of environmental intake;
pressor-accelerative processes with filtering out irrelevant stimuli which interfere with central
cognitive functioning. This finding was supported by Obrist (1963) using a different sample of subjects and
different stimulus situations. Confirmatory evidence was obtained from additional studies which showed
(1) attention to visual and auditory stimuli to produce cardiac deceleration while respiratory rate
increased, (2) "thinking" to produce cardiac acceleration, and (3) the more "analytic" the child, the
greater the acceleration (Kagan and Rosman, 1964; Kagan and Lewis, 1965; Lewis, et al, 1965). Moreover,
in reaction time experiments, Lacey has found that the greater the cardiac deceleration in anticipation
of the stimulus, the faster the motor response.

In summary, Lacey concludes that different fractions of autonomic, electroencephalographic, and motor
response are mediated separately by mechanisms which are clearly dissociable although they may be closely
related. He suggests that the biological utility of the dissociation resides in the capability of the
different fractions of response to influence cortical and subcortical functioning in different, sometimes
opposing, ways.

Kibler (1967), in an Aerospace Medical Research Laboratory study effort, sought to bridge the gap
between applications and laboratory research on the different cardiac response-stimulus situation
relationships by means of a vigilance experiment. The resultant data showed a positive relationship
between the extent of stimulus-oriented cardiac deceleration and detection efficiency during a x 1/2 hour
vigil. The study was regarded as a significant step toward developing an independent measure of alertness
during vigilance tasks. Subsequently, an unpublished pilot study by Crawford and Bachert, also of the
Aerospace Medical Research Laboratory, showed a trend toward increased cardiac deceleration, and reduced
sinus arrhythmia (the tendency of the normal heart rhythm toward irregularity), as a function of decreased
signal-to-noise ratios, produced by adding clutter to a simulated airborne digitized radar return display.

In the laboratory, Kalsbeek (1971) has found significant reduction in sinus arrhythmia as a function
of increases in the signal rate in a perceptual motor task. Kalsbeek (1968) also reported data indicative
of reduced arrhythmia as a function of increased task demands in a flight control simulation.

Cardiac data obtained from Navy carrier pilots flying missions over Southeast Asia showed average
heart rates to be substantially higher during launch and recovery than during bomb runs (Plattner, 1967).
These results were interpreted to mean bombing was a less demanding task than take-off and launch, which
was somewhat surprising to the researchers although not necessarily to all pilots. It is conjectured
that analysis of the specific stimulus-situations involved, in accordance with Lacey's theoretical position,
might have reversed the interpretation.

Some attempts to use cardiac response measurement, in combination with a battery of other physiological
correlates of performance, have proven less than satisfactory. One possible explanation for the
difficulty recognized in at least one such attempt is the failure to differentiate between actual
workload and performance, i.e., removal of flight instrument information produced a decrement in flight
control performance, which was interpreted as an overload condition; but it also reduced the information
load, which if effectively processed would have resulted in improved performance. Careful, accurate data
collection and analysis is also essential to effective use of physiological data within theoretical
contexts as posed by Lacey.

Nevertheless, the evidence with regard to cardiac response "situational-specificity" is judged to
be sufficient to warrant further investigation of the measure under carefully controlled conditions
employing the Sternberg task and information processing paradigm to assess relationships between increasing
demands of the various "stages" (stimulus input, central processing, etc.) as well as the transformation
processes (classification, conservation, condensation, creation, etc.) at different demand levels. The
ultimate potential advantages of this program are at least two-fold: (1) increased validity of the
performance theory developed and (2) a relatively unobtrusive, objective workload assessment technique for
use during actual system operations and during system simulations to precisely identify crew functions
which require automated aiding via digital processing capabilities.

Evoked potential measurement appears to be another technique with reasonable promise for facilitating
performance theory and workload assessment developments. Instrumentation for obtaining average evoked
potentials involves the attachment of electrodes to appropriate areas of the scalp in the same manner as
required to produce an EEG. The continuous electrical activity so obtained is conducted through an
amplifier to an averaging computer. A stimulus may then be presented and the resulting activity
simultaneously averaged by the computer. The resultant measure of the nonrandom activity is the average
evoked response (Childers and Perry, 1969).



This response-averaging technique, which enhances the signal-to-noise ratio, also accurately
identifies specific psychological variables with components of the EEG. A stimulus initiates a series of
physiological processes related to both perception and preparation for an overt behavioral response.
Analysis of the electrical activity between stimulus and response can provide useful information concerning
factors such as the timing, process speed and anatomical location of physiological events associated with
the psychological processes involved. Cognitive and motivational as well as stimulus and response variables
may be included in the experimental situations achieved via this arrangement (Vaughan, 1966).
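The signal-to-noise benefit of response averaging can be illustrated with synthetic data. The sketch below is not a model of any particular recording system: the waveform shape, noise level, number of epochs and all function names are hypothetical. It averages stimulus-locked epochs sample by sample, so that activity unrelated to the stimulus tends to cancel while the time-locked component remains.

```python
import math
import random


def average_epochs(epochs):
    """Average stimulus-locked epochs sample by sample."""
    n = len(epochs)
    return [sum(column) / n for column in zip(*epochs)]


def rms_error(signal, reference):
    """Root-mean-square difference between a signal and a reference waveform."""
    squared = sum((s - r) ** 2 for s, r in zip(signal, reference))
    return math.sqrt(squared / len(reference))


def simulate_epochs(n_epochs=100, n_samples=64, noise_sd=5.0, seed=1):
    """Synthetic epochs: one fixed 'evoked' bump buried in Gaussian noise."""
    rng = random.Random(seed)
    template = [math.exp(-(((i - 20) / 5.0) ** 2)) for i in range(n_samples)]
    epochs = [
        [t + rng.gauss(0.0, noise_sd) for t in template] for _ in range(n_epochs)
    ]
    return template, epochs


template, epochs = simulate_epochs()
single_error = rms_error(epochs[0], template)
averaged_error = rms_error(average_epochs(epochs), template)
print(f"single epoch error: {single_error:.2f}, averaged error: {averaged_error:.2f}")
```

For independent noise, the residual error of the average falls roughly as the square root of the number of epochs, so about 100 presentations give close to a tenfold improvement in this synthetic case.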

Theoretical issues related to the limiting central mechanism and serial vs. parallel processing
appear to be most amenable to investigation via evoked potential methodology. The value of evoked
potential measures as an aid to assessment of workload under operational or system simulation conditions
is yet to be established. However, Weissman (1969), in promoting the use of average evoked potentials for
assessment, notes that the technique has no equivalent when it comes to minimizing interference with
the subject. Hence, evoked potential measurement must be considered a possible unobtrusive method
for workload assessment under flight test or operational conditions.

It has been suggested that a complete battery of psychophysiological instruments might include the
measurement of heart rate, electrical activity of the brain, muscle activity, skin resistance, blood
pressure, sinus arrhythmia, average evoked potentials, urinalysis, parotid fluid, pupillary response,
metabolic rate, oxygen uptake and ventilatory rate (Gartner and Murphy, 1976).

However, because of the prevalent interest in cognitive or information processing/decision making
activities, the EKG and EEG domains currently have the greatest appeal as primary sources of
psychophysiological data and continued exploratory development.

Analytic/Predictive  Methodology 

The final thrust of a comprehensive workload assessment development effort must include the
incorporation of the results or products of the thrust areas into analytic and predictive methods. First, the
performance theory and quantitative functional relationships between human input-output parameters will
have to be reflected in task analytic procedures. The purpose of task analysis is to provide the
basic building blocks for subsequent human engineering analyses during system design and development.

Task analysis entails the specification of tasks to be accomplished by human operators, including the
behavioral requirements of the tasks, kinds of discriminations to be made, decision making, motor responses,
etc. From the task analysis, estimates of error rates, time line projections and personnel aptitude and
training requirements must be made.

Task analytic methodology as it exists today represents little more than the crude beginning made
some 25 or 30 years ago. Critically needed research required to appropriately expand and validate
essential behavioral information has not been forthcoming. Consequently, job analyses are expected to do
more than they possibly can. Although analysts continue to break work into smaller elements to produce
the expected documentation, it is largely a reductionistic effort without sufficient regard to the
meaningfulness of the behavioral elements (Bryan and Regan, 1963).

It is suggested that emphasis should be upon implementation of system models (mathematical and
computer simulation models) as analytic/predictive tools during system design. It has been said that the
sign of maturity in systems analysis will be the development of useful models (Shapero and Bates, 1959).

The SAINT methodology promises to be a useful vehicle in achieving the desired advance in systems
analysis (Wortman, Seifert and Duket, 1975). (SAINT was referenced briefly in the earlier discussion of
simulation, which was primarily concerned with real-time, man-in-the-loop simulations.)

SAINT consists of a symbol set for modeling systems and a computer program for analyzing the models.
SAINT includes the conceptual framework for representing systems which include discrete task elements,
continuous state variables and interactions between them. SAINT is not a model. It simply provides a
framework within which any quantitatively expressed model, or models, may be described and exercised.
And, since it was designed for addressing human performance, in particular, within system contexts, it is
potentially an ideal vehicle for integrating generic behavioral functions such as are advocated within
human performance theory. The resultant computer models of systems concepts could, then, readily
evaluate the probability and source of system/task demands which exceed operator, or crew, workload
handling capability.

In applying SAINT, systems are represented as graphical networks of task-activities with which one
or more operators interact. Each task is described with respect to how its performance relates to other
tasks within the system of interest. The graphical analysis is then input to the SAINT computer program
for automated performance assessment. Using Monte Carlo techniques, the SAINT program permits simulation
of probabilistic task performance and precedence relationships while collecting estimates of system
performance at the same time. Capabilities are included for simulating continuous or discrete system
state variables and their response to discrete control task execution and for dynamic modification of both
operator and system characteristics as dictated by internal or external simulated "events" (Kuperman and
Seifert, 1975). Thus, this computer modeling technique permits fast time evaluation of human engineering
design alternatives, and other human factors, e.g., skill level, training and motivation, within system
contexts. However, it is just as dependent on a valid scientific base as conventional task analytic
methodology.
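The general idea of Monte Carlo evaluation of a task network can be sketched in a few lines. This is not SAINT and does not use its symbol set or program interface; it is a hypothetical, greatly simplified stand-in in which a sequence of 13 key actions is executed in order, each action takes a normally distributed time, and each may have to be repeated with some probability. The step count and the 160 iterations echo the MFK example described earlier; the time and repeat parameters are illustrative placeholders, not values from the study.

```python
import random


def simulate_sequence(n_tasks, mean_time, sd_time, p_repeat, rng):
    """Total time for one pass through a serial network of key actions."""
    total = 0.0
    for _ in range(n_tasks):
        while True:
            # Sample one attempt at the task; negative samples are clipped to 0.
            total += max(0.0, rng.gauss(mean_time, sd_time))
            # With probability p_repeat the action fails and must be repeated.
            if rng.random() >= p_repeat:
                break
    return total


def monte_carlo(n_runs=160, n_tasks=13, mean_time=0.65, sd_time=0.15,
                p_repeat=0.03, seed=0):
    """Mean and standard deviation of task time over n_runs iterations."""
    rng = random.Random(seed)
    times = [
        simulate_sequence(n_tasks, mean_time, sd_time, p_repeat, rng)
        for _ in range(n_runs)
    ]
    mean = sum(times) / n_runs
    variance = sum((t - mean) ** 2 for t in times) / (n_runs - 1)
    return mean, variance ** 0.5


mean_time, sd_time = monte_carlo()
print(f"mean task time: {mean_time:.2f} s (SD {sd_time:.2f} s)")
```

A fuller network model would add branching, parallel operators and state variables, but the estimate of system performance is obtained in the same way: by repeating the simulated mission many times and summarizing the outcomes.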

Preliminary attempts have been made to apply SAINT to current USAF system design problems. For
example, a SAINT model of the cockpit simulator used to investigate multifunction switching and
multipurpose displays for the Digital Avionics Information System Advanced Development program was developed
(Kuperman and Seifert, 1975). Model networks were developed for both conventional dedicated avionics
subsystem instruments and the multipurpose controls and displays. Exercise of the model provided
estimates of performance within the limits of available empirical data. Conclusions of the investigators
included: (1) The SAINT simulation techniques are readily applicable to predictive modeling of new
concepts of man/machine interaction. (2) The techniques are appropriate to the study of the theories of
human performance and to evaluation of experimental metrics for their implementation.

INTRODUCTION

A central goal of a military workload analyst is to understand the determinants of mission success
in a military setting. The emphasis is on the human determinants of mission success, with particular
consideration to how the human uses the systems he is given to accomplish the mission at hand. In
quantitative workload analysis the final goal in many instances is to provide various numerical measures of
mission performance. For example, when examining a bombing mission, a workload analyst using mathematical
models might attempt to estimate the probability that bombs would land on target. Specifically, he might
attempt a statement such as: "The estimated circular error probable is 250 feet given the present
workload conditions." Other measures he might estimate include summary statistics such as anticipated loss
rates against specific enemy defensive configurations, and rates of overall success against enemy targets,
and these summary statistics will be of particular interest below.

A workload analyst studies the system under consideration to determine its capabilities and, when
appropriate, he designs system changes or modifications with a view to improving system performance. The
main purpose of this paper is to suggest that the workload analyst attempt to evaluate his proposed design
modifications within the framework of a quantitative or semi-quantitative cost/benefit tradeoff. This is
particularly appropriate when the analyst has developed relevant metrics describing system performance
both with and without the system modification.

A workload analyst can suggest a wide variety of system changes ranging from hardware modifications
to changes in system operating procedures. Whatever the changes suggested, a workload study in the
military setting can be represented by a cost/benefit table as shown in Figure 1.

In Figure 1, the basic or unmodified system has effectiveness e, vulnerability v, and cost per system
c. Subsequent discussion will provide definitions of e and v. System modifications can improve the
effectiveness of a system from the point of view of making the system more capable of inflicting losses on
the enemy. However, improving the fighting effectiveness of the system can increase or decrease the
system's vulnerability, just as decreasing a system's vulnerability can either decrease or increase the
system's fighting effectiveness. In Figure 1, e1, v1, and c1 are the effectiveness, vulnerability and
system cost of the basic military system with system modification #1. The symbols e2, v2, and c2 are
used in a similar manner for the system with modification #2.

Perhaps most readers will agree that composing the table in Figure 1 is a step forward, but, of
course, still remaining is the question of how to use the assembled data for actual decision making.
Should an investment of money be made, and if so should modification #1 or modification #2 be purchased,
or should one simply recommend that more elements of the basic system be procured? Analytical scenario
modeling can be a decision aid in this circumstance, and this will be described in the following section.
The method or methods whereby a quantitative tradeoff table such as that shown in Figure 1 can be developed
will be described briefly in the section of this paper following the next concerning analytical scenario
modeling.

ANALYTICAL  SCENARIO  MODELING 

In  this  section,  system  mission  effectiveness,  e,  and  system  mission  vulnerability,  v,  will  be 
defined  in  the  context  of  analytical  scenario  modeling.  For  the  purpose  of  illustrating  the  usefulness 
of  analytical  scenario  modeling,  a  simple  example  from  among  a  class  of  combat  models  called  Lanchester 
models,  will  be  employed.  This  class  of  models  was  developed  by  Lanchester,  an  aeronautical  engineer,  in 
about  1914,  and  is  extremely  simple  in  conception  and  approach  (1,  2). 

Consider two opposing forces, Blue force versus Red force. The rate of attrition of the Blue force
should be proportional to the number of Red systems available, that is

$$\frac{dB}{dt} = -rR \qquad \text{(Eq. 1)}$$

where B is the number of Blue force systems or elements, R is the number of Red force elements, and r is
the constant of proportionality which reflects Red's ability to reduce Blue force. Similarly, the
equivalent differential equation for the attrition of the Red force is

$$\frac{dR}{dt} = -bB \qquad \text{(Eq. 2)}$$

where b is a constant of proportionality which measures Blue force's ability to reduce the Red force.
If the Blue force is identified as the analyst's side, the proportionality constant b can be identified
with system effectiveness e, and, similarly, the proportionality constant r can be identified with Blue
force system vulnerability v. Thus the following equations obtain:

$$\frac{dB}{dt} = -vR \qquad \text{(Eq. 3)}$$

and

$$\frac{dR}{dt} = -eB \qquad \text{(Eq. 4)}$$

and these equations provide quantitative definitions of effectiveness and vulnerability. These equations
are easily solved to provide B and R as functions of time t. These solutions are shown below for the
interesting but highly simplified case where e and v are constants.

$$B(t) = \frac{1}{2\sqrt{e}} \left\{ \left(\sqrt{e}\,B_0 + \sqrt{v}\,R_0\right) e^{-\sqrt{ev}\,t} + \left(\sqrt{e}\,B_0 - \sqrt{v}\,R_0\right) e^{+\sqrt{ev}\,t} \right\} \qquad \text{Eq. 5.}$$

$$R(t) = \frac{1}{2\sqrt{v}} \left\{ \left(\sqrt{e}\,B_0 + \sqrt{v}\,R_0\right) e^{-\sqrt{ev}\,t} + \left(\sqrt{v}\,R_0 - \sqrt{e}\,B_0\right) e^{+\sqrt{ev}\,t} \right\} \qquad \text{Eq. 6.}$$

In these equations, $B_0$ and $R_0$ are the sizes of the Blue force and the Red force, respectively, at time
t = 0 at the start of combat (prior to any losses). These last two equations describe force attrition
during a battle. Far more complex attrition models are often developed to study force adequacy and
tactics. The suggestion made here is that such attrition models or analytical scenario models be adapted
and employed in workload tradeoff analyses. The concept is to compare design alternatives against
predicted combat outcomes, and to choose that system modification which optimizes desired outcomes. This
concept will be illustrated in the following by using equations #5 and #6.
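
As a check on equations #5 and #6, the following minimal Python sketch evaluates the closed-form solutions and compares them with a direct numerical integration of equations #3 and #4. It is an illustration only and not part of the original analysis: the function names, the fixed-step Runge-Kutta integrator, and the parameter values are assumptions chosen for the example.

```python
import math


def lanchester_closed_form(e, v, b0, r0, t):
    """Closed-form B(t), R(t) for dB/dt = -v*R, dR/dt = -e*B with constant e, v."""
    k = math.sqrt(e * v)
    plus = math.sqrt(e) * b0 + math.sqrt(v) * r0
    minus = math.sqrt(e) * b0 - math.sqrt(v) * r0
    b = (plus * math.exp(-k * t) + minus * math.exp(k * t)) / (2.0 * math.sqrt(e))
    r = (plus * math.exp(-k * t) - minus * math.exp(k * t)) / (2.0 * math.sqrt(v))
    return b, r


def lanchester_rk4(e, v, b0, r0, t_end, steps=1000):
    """Integrate the same two equations with a fixed-step fourth-order Runge-Kutta scheme."""
    def deriv(b, r):
        return -v * r, -e * b

    h = t_end / steps
    b, r = b0, r0
    for _ in range(steps):
        k1b, k1r = deriv(b, r)
        k2b, k2r = deriv(b + 0.5 * h * k1b, r + 0.5 * h * k1r)
        k3b, k3r = deriv(b + 0.5 * h * k2b, r + 0.5 * h * k2r)
        k4b, k4r = deriv(b + h * k3b, r + h * k3r)
        b += h * (k1b + 2.0 * k2b + 2.0 * k3b + k4b) / 6.0
        r += h * (k1r + 2.0 * k2r + 2.0 * k3r + k4r) / 6.0
    return b, r


# Hypothetical values: e = 0.04, v = 0.01, 100 Blue elements, 120 Red elements.
# The time t = 10 lies before either force reaches zero, where the model applies.
print(lanchester_closed_form(0.04, 0.01, 100.0, 120.0, 10.0))
print(lanchester_rk4(0.04, 0.01, 100.0, 120.0, 10.0))
```

The two printed pairs should agree closely; the closed form is meaningful only up to the time at which one force reaches zero.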

A  natural  military  goal  is  to  reduce  the  enemy  while  minimizing  one's  own  losses.  This  military 
goal  can  serve  as  an  outcome  metric  which  can  discriminate  between  differing  system  modifications.  Other 
outcome  metrics  can  be  defined  such  as  minimizing  one's  own  losses  while  reducing  the  enemy  in  the  shortest 
time  possible.  However,  for  the  purposes  of  the  present  illustration  the  simpler  metric  of  minimizing 
losses  alone  will  be  employed. 

Examining equations #5 and #6, it can be noted immediately that the Blue unit will ultimately dominate
the Red unit if the quantity $\sqrt{e}\,B_0$ is greater than the quantity $\sqrt{v}\,R_0$ (since
$(\sqrt{v}\,R_0 - \sqrt{e}\,B_0)$ is then a negative quantity in equation #6). If $\sqrt{e}\,B_0$ is
greater than $\sqrt{v}\,R_0$, R will be zero at time $t = t^*$ where

$$t^* = \frac{1}{2\sqrt{ev}} \ln\!\left( \frac{\sqrt{e}\,B_0 + \sqrt{v}\,R_0}{\sqrt{e}\,B_0 - \sqrt{v}\,R_0} \right) \qquad \text{Eq. 7.}$$

and, using this critical time, maximum Blue force losses can be calculated using the following formula:

$$\text{Losses}_b = B_0 - \left( \frac{e B_0^2 - v R_0^2}{e} \right)^{1/2} \qquad \text{Eq. 8.}$$

Similarly, if $\sqrt{v}\,R_0$ is greater than $\sqrt{e}\,B_0$, Red force will ultimately dominate Blue force, and B will be
zero at time $t = t^*$, having lost all $B_0$ systems.
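
The loss formula of equation #8 rests on a conservation property of equations #3 and #4 that is not spelled out in the text. A brief sketch of that step, using only the symbols already defined, is:

```latex
\frac{d}{dt}\left(e B^{2} - v R^{2}\right)
  = 2 e B \,\frac{dB}{dt} - 2 v R \,\frac{dR}{dt}
  = 2 e B\,(-v R) - 2 v R\,(-e B) = 0
\quad\Longrightarrow\quad
e B(t)^{2} - v R(t)^{2} = e B_{0}^{2} - v R_{0}^{2} .

% Setting R(t^*) = 0 in the conserved quantity gives the surviving Blue force:
B(t^{*}) = \left( \frac{e B_{0}^{2} - v R_{0}^{2}}{e} \right)^{1/2},
\qquad
\text{Losses}_b = B_{0} - B(t^{*}) .
```

The quantity under the root is positive exactly when $\sqrt{e}\,B_0$ exceeds $\sqrt{v}\,R_0$, which is the dominance condition stated above.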

These last equations will now be employed to accomplish an example tradeoff analysis. A hypothetical
cost/benefit table is shown in Figure 2. In this table, the fact that $\sqrt{e}\,B_0 = \sqrt{4} \times 250 = 500$ is less than
$\sqrt{v}\,R_0 = \sqrt{2} \times 400 \approx 566$ certainly motivates the Blue analyst to recommend changes. Modification #1 allows
Blue to defeat Red while sustaining a loss of 177 Blue elements. The cost of modification #1 is 37.50
million dollars, a sum which would allow procurement of 37 additional unmodified systems. Since
$\sqrt{4} \times 287$ is greater than $\sqrt{2} \times 400$, the Blue force, augmented by 37 elements, would defeat the Red force,
but in so doing the Blue force would sustain a loss of 201 systems. Thus, modification #1 would be
preferred over the equivalent Blue force augmentation.

Now consider modification #2. This modification allows Blue force to win with a loss of 142 units.
The cost of modification #2 is 125 million dollars. With this money, 125 additional unmodified systems
can be procured to form a force of 375 fighting elements. With this size force, Blue force defeats Red
force while sustaining losses of only 92 systems, indicating that equivalent augmentation of the unmodified
force would be preferable to purchasing modification #2. More complex mathematical models would allow
consideration of purchases of various combinations of modifications #1 and #2. Putting these more
complicated situations aside, and simply using what has been computed above, it can be concluded that if 125
million dollars were available, augmentation of the basic force should be accomplished without modifying
the individual elements of the force. However, if 37.5 million dollars are available for use, modification
#1 will minimize losses.
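
The comparison just described, a modification versus an equal-cost augmentation of the unmodified force, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to build Figure 2: the function names are hypothetical, the modification is assumed to apply to every element of the original force, and the numerical inputs are invented for the example rather than taken from the figure.

```python
import math


def blue_losses(e, v, b0, r0):
    """Blue losses by the time Red reaches zero (equation #8).

    Returns None when Blue does not win, i.e. when sqrt(e)*b0 <= sqrt(v)*r0.
    """
    margin = e * b0 ** 2 - v * r0 ** 2
    if margin <= 0.0:
        return None
    return b0 - math.sqrt(margin / e)


def preferred_option(base_e, base_v, b0, r0, mod_e, mod_v, mod_cost, unit_cost):
    """Compare a modification with spending the same money on extra unmodified units."""
    extra_units = math.floor(mod_cost / unit_cost)
    loss_modified = blue_losses(mod_e, mod_v, b0, r0)
    loss_augmented = blue_losses(base_e, base_v, b0 + extra_units, r0)
    if loss_modified is None and loss_augmented is None:
        choice = "neither"  # Blue loses under both options
    elif loss_augmented is None:
        choice = "modification"
    elif loss_modified is None:
        choice = "augmentation"
    else:
        choice = "modification" if loss_modified < loss_augmented else "augmentation"
    return choice, loss_modified, loss_augmented


# Hypothetical inputs chosen for illustration; they are not the entries of Figure 2.
choice, loss_mod, loss_aug = preferred_option(
    base_e=1.0, base_v=1.0, b0=100.0, r0=120.0,
    mod_e=2.0, mod_v=1.0, mod_cost=30.0, unit_cost=1.0,
)
print(choice, loss_mod, loss_aug)
```

The outcome metric here is minimum Blue losses alone, as in the illustration above; a metric that also weighs time to victory would need equation #7 as well.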

It has thus been illustrated how analytical scenario models can be used in workload analysis tradeoff
studies. These models can help workload analysts define their earned return on investment and can help
with decisions concerning modification alternatives. Admittedly perhaps the simplest scenario model has
been employed here to illustrate scenario model usefulness. It is anticipated that real-world decisions
would employ simulations which are far more complex and extremely well tested. Nonetheless, from the
above very simple example, the workload analyst should be prepared to realize that in some instances it
may be preferable to procure more of an unmodified system than to proceed to a modified system.

CONSTRUCTION  OF  QUANTITATIVE  TRADEOFF  TABLES 

In this section, the construction of quantitative tradeoff tables, as shown in Figures #1 and #2, will
be briefly discussed. These tables can be constructed using three different types of data sets. A data
set of type #1 consists of data derived on the military systems of interest and including the precise
effectiveness and vulnerability figures needed to complete the tradeoff table. Type #1 data sets are
rarely encountered in practice. These data sets can be developed from records of actual combat or they
can be developed from records of realistic practice or training encounters where different systems are
employed or compared. This data type, when it is available, provides the best and most direct data for
tradeoff studies.

A data set of type #2 consists of data derived from the actual military systems of interest in the
tradeoff study; however, the performance measures available from these systems are not the desired
effectiveness and vulnerability measures. Often in this setting the available data are indirect measures of
mission performance, or measures of human operator workload stress during mission performance, from which
the likelihood of mission failure can be inferred. For example, when the concern is with a bombing
mission, instead of obtaining the numbers of enemy targets destroyed per unit time by the competing systems,
this data type might provide circular error probable figures from which enemy target destruction would
have to be inferred. Still more indirect data concerning the bomber performance would be data that
related to aircrew stress during the performance of trial missions. Such measures are, for example,
voice stress measurements, galvanic skin responses, cortisol secretion and the like. It is clear that
with data sets of type #2, the workload analyst faces a problem of extrapolating from the available
performance measures to measures that are more relevant to a military decision in a tradeoff setting.
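
For the bombing example, one simple way such an extrapolation might be carried out is to convert a circular error probable (CEP) figure into the probability that a weapon lands within a given distance of the aim point. The Python sketch below assumes a circular normal miss-distance distribution, an assumption that the text does not make; the function name and the lethal-radius parameter are hypothetical.

```python
def hit_probability(lethal_radius, cep):
    """Probability that the miss distance is within lethal_radius.

    Assumes a circular normal miss distribution, for which half of all
    impacts fall within one CEP of the aim point, so that
    P(miss <= r) = 1 - 0.5 ** ((r / CEP) ** 2).
    """
    if cep <= 0.0:
        raise ValueError("cep must be positive")
    if lethal_radius < 0.0:
        raise ValueError("lethal_radius must be non-negative")
    return 1.0 - 0.5 ** ((lethal_radius / cep) ** 2)


# Hypothetical comparison of two competing systems against the same target.
for label, cep in (("system A", 50.0), ("system B", 80.0)):
    print(label, hit_probability(lethal_radius=50.0, cep=cep))
```

A per-weapon probability of this kind is still only an intermediate measure; relating it to targets destroyed per unit time requires further assumptions about sortie rate and target hardness.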

A data set of type #3 consists of data derived from systems which are not the military systems of
interest or concern in the tradeoff deliberations, but are other military systems currently in the
inventory, or are, as is often the case, laboratory simulations of the real systems under consideration.
Thus data sets of type #3 also pose serious extrapolation problems. In this case, the extrapolation
problem is one of relating data from one system to the relevant performance measure applicable to another
system.

As discussed above, data sets of type #2 and type #3 require that the workload analyst extrapolate
between measures, or between military systems, or both. Extrapolation can be done via experimentation
or via the use of experimentation coupled with the application of mathematical models. The use of
mathematical models in military workload analysis has been outlined in a previous publication wherein a coarse
classification of available modeling techniques is provided.

SUMMARY 

The above discussion has suggested that military workload analyses proceed in the setting of
quantitative or semi-quantitative tradeoff analysis. This setting is already quite familiar to the hardware
engineer, but may be a novel setting for the human factors workload specialist. The term semi-quantitative
analysis is employed to recognize the fact that it will not always be possible to quantitate
effectiveness and vulnerability as precisely as one would wish.

The methods described in this report rely heavily on mathematical modeling techniques. This is seen
in the suggestion to employ analytical scenario modeling in the tradeoff study, and is also seen in the
suggestion to employ mathematical models in the construction of the tradeoff table from data sets that are
not directly applicable. While mathematical models can be extremely useful and cost effective in
application, they must be used with sober caution. Mathematical models are best employed with an attitude
which considers the mathematical models, not as a replacement for traditional methods, but as an adjunct
to commonly employed methods of analysis and deliberation. Mathematical models should in no way displace
the direct use of experience and the direct consideration of empirical data. Rather, mathematical models
should be used to enhance and highlight the utility of available data sources. The analyst's dictum "never
believe your mathematical model" is a wise rule which is simply a statement of caution intended to remind
the analyst that mathematical models are as fallible as any other human-contrived decision aid.

CONCLUSION 

This report has discussed a method of tradeoff analysis as applied to workload analysis in the military
environment. It is suggested that workload studies be performed in a tradeoff setting which allows the
analyst to estimate the return on investment he has earned through his proposed system modifications.
The methodologies described employ mathematical modeling techniques, and it is emphasized that these
techniques are an adjunct to, and not a replacement for, more traditional methods of workload analysis.

INTRODUCTION

Operator workload for the task of vehicle manipulation perhaps could be defined as the sum of
sensory inputs, psychomotor responses, and cognitive processes. Sensory inputs to the operator are
utilized to direct control manipulation, obtain feedback as to degree of effectiveness of the control
movements, and to monitor system status. This input workload is combined with the psychomotor workload
required to move the vehicle controls as dictated from the sensory inputs and feedback modes. More
simply stated, workload measurements can be derived by objectively measuring the input and/or output of
the operator.

The ability to manipulate an aircraft, as well as a tank, car, or any other vehicle, is directly
related to inputs or cues the operator receives from the environment. Of these perceptual inputs
(tactile, visual, auditory, etc.) required to fly an aircraft, visual cues are considered vital. E.
Hartman has even estimated that vehicle operators acquire over 90% of their required information
visually. Processing and integrating these visual cues allow the pilot to detect the aircraft's
relative stability and ground references, and provide feedback from his control functions. During flight
conducted under instrument meteorological conditions (IMC), lack of cues from the environment outside
the aircraft requires the pilot to obtain the necessary visual information from instrument displays.
As a consequence, there exists the need, independent of visual conditions, to determine what cues or
visual workload are required to achieve maximum pilot efficiency with minimal fatigue-induced errors
and safe mission accomplishment.

A great variety of apparatus and techniques have been developed for the study of visual performance/
workload (2, 3, 4). One of the earlier devices was a smoked-drum kymograph attached to the sclera of the
eyeball via fine wire and barbed hooks. During the 1930's, electrooculography (EOG) techniques were
developed which utilized electrodes placed around the eyes on the facial structure to monitor
differential voltages as the eyeball was rotated (5).

The earliest documented technique for measuring the visual performance of pilots was to simply
record pictures of the operator's face while he scanned the instruments (6). Improvements of this
method were accomplished by arranging mirrors on the instrument panel and photographing the total
arrangement. Documentation of eye movement was obtained by means of a camera mounted behind the
pilot; during analysis a photo interpreter scanned the film to determine which mirror reflected the
eye of the pilot at various times during the flight (7).

This technique was further refined by Mackworth (8). His approach was to mount a lightweight
moving picture camera beside the operator's head along with a series of mirrors which reflected a dot
representing the eye's motion. This dot was superimposed on photographs of the scene directly in front
of the center line of the head. More recently this same "corneal reflection" technique has been utilized
by the US Army Aeromedical Research Laboratory in the study of Army pilot visual performance during
helicopter flight (9, 10).

The corneal reflection technique is possible because of the smooth spherical front surface of the
cornea. An incident beam of light can be partially reflected, forming a bright spot or "highlight" on
the cornea. The angle of the reflected light depends upon the angle between the incident light ray and
a plane tangent to the reflecting surface. Since the cornea forms an eccentric bulge on the nearly
spherical eyeball, the angle of this tangential plane on the cornea at any one point changes as the eye
rotates around its center during eye movement. As a result, the position of the highlight follows the
direction of movement of the cornea. The reflected beam is easily photographed on film. By mounting
a camera lens on the subject's head slightly above and between his eyes, the subject's normal visual field
can be recorded and the highlight can be superimposed on the scene to give a constant eye reference. From
the eye's highlight, the area of visual concentration and the percentage of time for eye stabilization
during any flight maneuver can be recorded.

Past research has demonstrated two major advantages of the corneal reflection technique for
studying eye movement. First, the method is convenient for large scale testing of subjects in that it
requires minimal training. Second, these studies have reported no significant interference with normal
eye movement (11, 12). This laboratory utilizes motion picture film to record the visual performance
data. Figure 1 is a picture of the oculomotor lens and peripheral equipment. The total methodology
is outlined in USAARL Report No. 77-A (13).

Investigations which have been devised to collect data related to visual performance can be divided
into three categories: (1) subjective opinions of visual performance, (2) objective visual performance
data during fixed wing flight, and (3) objective data during helicopter flight. Studies by Siegel and
MacPherson (14), Clark and Intano (15), and Simmons, et al. (16) have analyzed the opinions of aviators as
to which instruments they felt were utilized to fly selected maneuvers. However, these findings do not
agree with the research results of Frezell, et al. (10), Sanders (12), and Simmons, et al. (13). These
investigators have reported a very poor agreement between subjective data and actual pilot visual
performance. Additional studies by Milton, Jones, and Fitts (6), Fitts, et al. (7), and Diamond (17) have
utilized equipment to obtain objective visual performance data of aviators during flight maneuvers in
several fixed wing aircraft. Although these investigations provided useful information as to visual
performance during fixed wing flight, data obtained during this work cannot easily be generalized to
rotary wing flight because of the extreme aerodynamic differences between airplanes and helicopters.


Sunkes, et al. (18), Stern and Bynum (19), and Frezell, et al. (9, 10) have recorded visual performance
in helicopters during selected visual flight rules (VFR) flights. Additionally, two reports (20, 21)
investigated a number of maneuvers utilizing both the interview technique and in-flight recordings
of visual performance of two aviators under instrument flight rules (IFR) conditions. These efforts
have provided some needed information as to the frequency, duration, and sequence of fixations during
helicopter operations.

Although these studies have provided useful information for the visual performance data base, much
investigation remains to be accomplished before a reliable visual performance/workload model can be
established for safe helicopter flight. The purpose of this report is to attempt to combine the visual
performance investigations being performed at the US Army Aeromedical Research Laboratory into one model
for predicting visual workload.


THEORY 

Several measurements of visual performance derived from data collected via the corneal reflection
technique contribute to the total relationship of visual workload. In simple terms, oculomotor activity
can be divided into two categories: (1) movement of the eye, during which minimal information gathering
occurs, and (2) fixation, a period of relatively no movement during which information transfer is felt to
be the greatest (1). The movement activity is defined as the visual link value, or the visual path
traveled from one area of interest to another. On the other hand, the visual nonmovement term, visual
fixation, is defined as the eye remaining stationary within a designated area for at least 100 milliseconds.
Other visual terms which could be included are the total number of areas that are concentrated on (or
fixated), the length of time of each fixation (or dwell time), and the frequency with which areas of interest
are fixated.
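
The fixation and link definitions above can be sketched as a small Python routine operating on a time-ordered record of which area of interest the eye highlight falls in. This is a minimal illustration, not the laboratory's film-analysis procedure: the sample format, the area names, and the function names are hypothetical, and duration is taken here as the time between the first and last sample of an uninterrupted run.

```python
from collections import namedtuple

Fixation = namedtuple("Fixation", "area start_ms duration_ms")


def extract_fixations(samples, min_duration_ms=100):
    """Group (time_ms, area) samples into fixations of at least min_duration_ms.

    area is None while the eye is moving between areas of interest.
    """
    fixations = []
    run_area, run_start, run_end = None, None, None
    for time_ms, area in samples:
        if area == run_area:
            run_end = time_ms  # still inside the same run
            continue
        # The area changed: keep the finished run if it is long enough.
        if run_area is not None and run_end - run_start >= min_duration_ms:
            fixations.append(Fixation(run_area, run_start, run_end - run_start))
        run_area, run_start, run_end = area, time_ms, time_ms
    if run_area is not None and run_end - run_start >= min_duration_ms:
        fixations.append(Fixation(run_area, run_start, run_end - run_start))
    return fixations


def link_values(fixations):
    """Count transitions between successive fixations on different areas."""
    links = {}
    for prev, nxt in zip(fixations, fixations[1:]):
        if prev.area != nxt.area:
            key = (prev.area, nxt.area)
            links[key] = links.get(key, 0) + 1
    return links


# Hypothetical record sampled every 20 ms.
samples = [(t, "attitude") for t in range(0, 121, 20)]
samples += [(140, None)]
samples += [(t, "airspeed") for t in range(160, 201, 20)]  # too short to count
samples += [(220, None)]
samples += [(t, "altimeter") for t in range(240, 401, 20)]
fixations = extract_fixations(samples)
print(fixations)
print(link_values(fixations))
```

Dwell time per area and fixation frequency per area follow directly from the returned list.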

If one assumes that the major input mode is the fixation period, two possibilities exist. Visual
workload could be a function of the time required for information to be transferred during fixation; or,
workload could be related to the frequency of visits to an area of interest. Since from a previous
investigation (22) neither term was found to adequately describe visual activity independently, both
comprise this input mode workload and should be combined. Thus, a formula utilizing these two terms
would reflect the workload cost of all areas that were fixated by an operator during vehicle manipulation.
This formula would appear as:

$$CF_a = \frac{T/\Sigma T + N/\Sigma N}{2}$$

$CF_a$ represents the "cost factor" of an area of interest. "T" is the elapsed time spent fixated on the
area, which is divided by the total time ($\Sigma T$), while "N" is the number of fixations of the area,
which is divided by the total number of fixations ($\Sigma N$). When the sum of these two ratios
is divided by 2, the CF is expressed as a percentage of workload. If the CF values of several areas of interest
lend themselves to being combined into common zones of interest, the CF values are simply summed
together ($CF_z = CF_{a1} + CF_{a2} + CF_{a3} + \ldots$).
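
The cost factor formula and the zone summation can be sketched in Python as follows. This is a minimal illustration rather than the laboratory's data reduction program: the function names, the instrument names, the dwell times, and the assignment of instruments to zones are all hypothetical.

```python
def cost_factors(fixations):
    """Return {area: CF} where CF = (T / sum(T) + N / sum(N)) / 2.

    fixations is an iterable of (area, dwell_time) pairs, one per fixation.
    """
    time_by_area, count_by_area = {}, {}
    for area, dwell in fixations:
        time_by_area[area] = time_by_area.get(area, 0.0) + dwell
        count_by_area[area] = count_by_area.get(area, 0) + 1
    total_time = sum(time_by_area.values())
    total_count = sum(count_by_area.values())
    if total_count == 0 or total_time <= 0.0:
        raise ValueError("at least one fixation with positive dwell time is required")
    return {
        area: 0.5 * (time_by_area[area] / total_time + count_by_area[area] / total_count)
        for area in time_by_area
    }


def zone_cost_factors(area_cf, zone_of_area):
    """Sum the CF values of the areas assigned to each zone."""
    zones = {}
    for area, cf in area_cf.items():
        zone = zone_of_area[area]
        zones[zone] = zones.get(zone, 0.0) + cf
    return zones


# Hypothetical fixation record: (instrument, dwell time in seconds).
record = [("attitude", 0.6), ("airspeed", 0.2), ("attitude", 0.4),
          ("oil_temp", 0.3), ("altimeter", 0.5)]
# Hypothetical zone assignment for the example.
zone_of_area = {"attitude": "Z1", "airspeed": "Z2", "altimeter": "Z2", "oil_temp": "Z3"}

area_cf = cost_factors(record)
print(area_cf)
print(zone_cost_factors(area_cf, zone_of_area))
```

Because both ratios sum to one over all areas, the CF values of all areas, and hence of all zones, also sum to one.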

Based on our experience, the visual inputs required to manipulate an aircraft can be divided into
three broad categories: (1) basic vehicle control, (2) barrier avoidance, and (3) navigational tracking.
The first requirement takes precedence over the latter two. Under this category of basic vehicle control,
visual workload can be further separated into three major zones of common areas of visual interest. Again,
the highest priority zone contains visual cues which provide information relating to the basic vehicle
stability about its three major axes of pitch, yaw, and roll.

The second zone of common areas of visual interest includes the input information which supports the
first areas but provides for more precise vehicle control. Information such as vehicle speed, altitude,
and rates of acceleration would be provided from this zone.

The last zone would be comprised of vehicle status information. These cues would provide operator
visual feedback as to the operational condition of the vehicle. Examples of such types of information
would be provided from engine oil temperature, fuel pressure, or electrical gauges. As long as there
was no malfunction of the vehicle as annunciated by one of these instruments, this zone of visual inputs
would have the lowest priority of being monitored.

To  summarize,  the  CF  theory  provides  a  method  of  combining  numerous  blocks  of  visual  data  to  provide 
a  more  concise  picture  of  input  workload  of  vehicle  operators.  The  CF  value  computed  for  Zone  1  should 
be  an  indicator  of  basic  workload  required  to  perform  the  task  successfully.  Zone  2  will  also  provide 
supportive  data  of  the  basic  workload  based  on  time  available  after  Zone  1  requirements  are  met. 

It should, however, be quickly pointed out that maximum visual performance of an area or zone of
areas could indicate high visual workload. On the other hand, this same performance could reflect a
high percentage of nonworkload (free time) in which the particular zone was fixated because it was
centrally located. This could be demonstrated by similar visual workload in the central viewing field
of a boat operator on a large lake and a helicopter pilot during nap-of-the-earth maneuvers. However,
by establishing conditions which provide measurements of the baseline for the maximum time utilized and
the minimum time required to maintain vehicle stability, these "free time" periods can be estimated. An
example of this can be reviewed in USAARL Report No. 78-6, Visual Performance/Workload of Helicopter
Pilots During Instrument Flight (22).

The remainder of this report will deal with the data base which the US Army Aeromedical Research
Laboratory has obtained during helicopter and fixed wing maneuvers in attempting to establish a visual
workload model. These types of data not only provide the needed information to test the CF theory, but
also will provide information to improve and refine the theory to provide operational answers for safer
military airborne operations.

APPLICATION 

Initially, a study was designed to investigate the visual performance of helicopter pilots during actual
flights under instrument flight conditions (IFR) (22). This study was unique because the aviators were
forced by the test conditions to receive any and all visual cues to manipulate the aircraft from the
instrument panel. This limited visual field allowed investigators to analyze which cues were fixated and
derive what information was required by the pilots. During VFR this extraction of visual performance data
would be very difficult because of the lack of precise definitions as to the quality of possible VFR cues.

Visual performance via the corneal reflection technique was collected from two groups of subject
pilots. Subject groups were categorized on the basis of flight experience, with one group having over
2,000 more flight hours than the other. All subjects flew the same instrument flight profile comprised
of eight basic maneuvers. The results of the study are summarized by Figure 2. IQA identifies the pilots
with the most flight experience while SQA represents the low time pilots. Z1, Z2, and Z3 designate the
three zones of instruments following the previously discussed method of classification. Table 1 is the
listing of those instruments comprising each zone.

Since Zone 1 is the most critical indicator of visual workload, the data reflect that the experienced
pilots had more workload to complete the mission than did the less experienced pilots. This could be
further interpreted to mean that: (1) the IQA could spend less time in this flight environment before
becoming fatigued, or (2) the IQA would most likely make more flight errors sooner than the SQA pilots.

These results appear to contradict the common philosophy that experienced pilots should have been the
better combat-prepared pilots. Therefore, the data were re-examined more closely for other possible
explanations. In attempting to establish other group differences, it was concluded that although the IQA
group did have the most total flight time, they were all currently holding job positions as instrument
flight instructors. For this reason they, in fact, had less current "hands-on" experience than the SQA
group, who were all recent graduates of flight school and therefore had just completed a very concentrated
block of "hands-on" flight experience.

To further test this line of thought, a single subject was selected who currently had 2,500 hours of
flight experience but who had not flown for the past three years (23). His initial flight test results
(NQA) are reflected by Figure 3. The results indicate a significantly higher level for his visual
workload in Zone 1 to perform the same mission as the previous subjects. This subject was then given 14 hours
of refresher training by the laboratory's instructor pilot. Figure 4 presents the results of his last
flight (HQA) on the same profile compared again to the initial SQA subject group. It is apparent that
his workload to perform the mission has been reduced to a level similar to that of the SQA group. These
results would seem to indicate that utilizing the CF method of calculating visual workload aided in
identifying differing visual workload as a function of the aviator's current proficiency.

This same method was again assessed during a second investigation which compared the visual workload
associated with flight of a fixed wing aircraft during instrument conditions to that of the original
rotary wing instrument flights (24). A U-21 fixed wing aircraft was flown over the same flight profile
as in the helicopter instrument flight. Two subject groups were again utilized. However, for this
investigation, the first group were current instructor pilots (ICA), comparable to the IQA group in
the helicopter report. The second group consisted of noncurrent U-21 pilots (NCA) who had not flown the
U-21 for at least 3 years prior to the test flight. The purpose of this investigation was twofold in
that it allowed a comparison of visual workload as a function of vehicle stability (i.e., rotary wing
versus fixed wing aircraft) while further testing the currency versus experience question.

Figure 5 represents a comparison of the two U-21 subject groups. The results indicate, as have past
findings, that the noncurrent aviators (NCA) experienced more visual workload than did the current aviators
(ICA). However, a confounding variable was that the NCA subjects were all current in the UH-1 helicopters.
Because of this variable, the level of difference between subject groups is perhaps not as significant as
would be anticipated. Nevertheless, the CF visual workload theory was effectively utilized to indicate
the visual workload associated with aviator current proficiency.

The ICA subjects of the U-21 study were then compared to the IQA aviators from the helicopter instrument
report. A representation of this comparison is referenced by Figure 6. Again, if the CF theory is
an indication of cost or workload associated with the manipulation of a vehicle, the results would
demonstrate that the UH-1H helicopter requires more visual workload on the primary Zone 1 instruments than did
the U-21 fixed wing aircraft. These findings would be predicted by subjective data and the relative
ratings of the stability of the two vehicles by other test agencies. However, the implications of the
visual data are that if the helicopter stability was improved, then the crew could remain on station or
in combat longer before becoming fatigued.

This same method of testing could be implemented to test future generations of helicopters to determine
relative stability. If such aircraft did impose less visual input work to manipulate, they would provide
a better platform for combat utilisation.

To further expand the line of thought that the CF theory could reflect in some part the visual workload
associated with the stability of the vehicle, a study has been completed. This investigation compared two
groups of subjects with qualifications similar to the original SQA and IQA groups of the helicopter study
(25). The two groups were, however, tested in a UH-1 flight simulator which was developed for the US
Army to duplicate the flight, engine, and system characteristics of the UH-1 helicopter.

The results are summarised in Figure 7. The conclusions that can be drawn from these results are
that the UH-1 simulator does, in general, have the same visual workload pattern as the UH-1 helicopter.
However, because the visual workload in Zone 1 is higher for the simulator, the vehicle is less stable
than the UH-1 helicopter. An expansion of the CF of Zone 1 can be seen in Figure 8. The three instruments
that comprise this zone are indicated by AH for the artificial horizon, RMI for radio magnetic
compass, and T-B for turn and bank indicator. From the major difference of the two vehicles as seen on
the workload of the RMI, the indication would be that the UH-1 simulator is less stable mainly on the
yaw axis. In addition, the inter-group subject differences in the simulator reflected the same results
as had been reported in the UH-1 helicopter study.

CONCLUSIONS

In summary, this paper has attempted to address a method of assessing workload requirements imposed
on one of five possible operator sensory channels. Since the demand of this visual input channel is
estimated, as previously stated, to be 90% of the total input demands for vehicle manipulation, any theory
which allows even a partial but precise description of the workload could aid future hardware designs,
training, and mission delineation. These data will further be useful in determining an approach to reduce
operator fatigue in the flight environment.

The current CF theory, although not the final answer, allows a more concise picture of visual workload
than the classical methods which normally consist of the permutation of seemingly unrelated visual
data points. The application section of this report demonstrated how the US Army Aeromedical Research
Laboratory has collected and is continuing to expand a data base describing pilot visual performance in
the military environment. Such data are considered invaluable to expand and test the current CF theory
as well as providing an objective method to be utilised in answering current operational questions and
problems. The examples were brief descriptions of studies which are already published in their entirety
or are in the process of being completed. The implications from the results suggest that the CF theory
is a valuable tool in testing and determining what the visual workload level should be for combat
proficient pilots, how long pilots with varying degrees of proficiency could be expected to fly in the combat
environment, and aircraft design requirements (such as stability), to reduce the onset of fatigue-induced
errors. Additionally, the CF theory can be utilised to test and determine varying mission related
workload, as well as the workload required by special equipment such as the night vision goggles, navigation
equipment, and experimental flight displays.

The ability to measure visual input workload and/or psychomotor control is recognized as an invaluable
tool required to validate instrument panel design, develop training and proficiency requirements and, in
general, provide a more effective helicopter system for mission accomplishment.


TABLE  1 

INSTRUMENT  CLUSTERS  WITHIN  EACH  ZONE 


ZONE I
    1.  Attitude Indicator                    AH
    2.  Radio Magnetic Compass                RMI
    3.  Turn and Slip Indicator               T&B

ZONE II
    1.  Altimeter                             ALT
    2.  Airspeed Indicator                    AS
    3.  Vertical Velocity Indicator           VSI

ZONE III
    1.  Aircraft Monitoring Gauges            TORQ, RPM, ELEC, OIL & FUEL
    2.  Special Navigation Instrumentation    OBS
    3.  All Other Visual Areas                REST

Figure 1. Visual Recording Equipment

Figure 2. Graph of CF/Zone

Figure 3. Graph of CF/Zone

Figure 4. Graph of CF/Zone

Figure 5. Graph of CF/Zone - Fixed Wing Aircraft

Figure 6. Zone/CF for IZ.Q/ Aircraft


HANDLING QUALITIES, WORKLOAD AND HEART RATE
by

Alan H. Roscoe
Royal Aircraft Establishment
Bedford, England


Introduction 


The important and close relationship between aircraft handling qualities and pilot workload has been
underlined by several authors. Twelve years ago Westbrook and his colleagues (1) stated: "To a pilot
the multiple stresses of flight, his workload, are summarized under his judgement of the handling
qualities." Today, there is an increasing tendency for the pilot to be less of an active controller and
more of a supervisor, but even so, this statement - especially when applied to short term workload -
still holds true. This changing role of the pilot has led to a wider interpretation of the term handling
qualities. Cooper (2) remarked: "Because of preoccupation with manual control in the past, it has been
a general practice to associate handling qualities primarily with aircraft stability and control
characteristics. Actually, handling qualities encompasses not only the aircraft stability and control
but the total of the pilot-aircraft interface features as well." Most people now interpret the
term handling qualities in this way and it is convenient to do so in this paper.

Unfortunately, no such agreement exists about the interpretation of the term workload. It is,
therefore, important for authors to make clear their own interpretation of the term. In this paper, the
basic idea of pilot workload is considered to be effort-related, as distinct from task- or
performance-related concepts. A suitable definition is that given by Cooper and Harper (3): "the integrated physical
and mental effort required to perform a specified piloting task." The idea of workload as effort is one
with which most pilots would agree (4); and it is consistent with the measurement of heart rate as a
means of assessing workload.

Assessing  handling  qualities  and  the  associated  workload  is  an  important  part  of  flight  evaluation, 
whether  of  control  and  stability  or  of  guidance  systems,  and  various  assessment  methods  are  used  by  test 
pilots  and  engineers  for  this  purpose.  Measurement  of  performance,  which  is  an  important  and  essential 
part  of  control  and  guidance  evaluation,  may  be  used  to  estimate  changes  in  both  handling  qualities  and 
workload (5) (6). Unfortunately, changes in handling and workload are not always reflected by changes in
performance.  In  1956,  Duddy  (7)  highlighted  the  difficulty  of  estimating  the  extent  of  improved  stability 
in  a  directionally  unstable  fighter  when  fitted  with  a  yaw  damper  as  aiming  accuracy  was  not  improved. 

As Spyker et al. (8) have observed: "An evaluation procedure which relies exclusively on performance
measures is inadequate. That is, a pilot with one configuration may work twice as hard as he does with
another, yet achieve equal performance with both." This ability of pilots to "compensate" is referred
to by Cooper and Harper (3) in their Handling Qualities Rating Scale.

Another method of assessing handling qualities and levels of workload, especially during landing
approaches, is by measuring control activity. Morrison and Stimely (9) quantified pitch activity and
used the results to augment pilots' subjective impressions of workload during noise abatement approaches.
Barber et al. (10) summated force inputs from elevator, aileron, and rudder to give a workload factor
during the evaluation of general aviation aircraft handling qualities. Nevertheless, these authors
accepted that using force inputs to give a workload factor "... has some deficiencies."

Objective  techniques,  especially  if  they  involve  precise  measurement,  are  particularly  attractive 
to  engineers.  However,  by  far  the  most  used  techniques  for  evaluating  handling  qualities  and  workload 
are subjective. These techniques, which vary from simple comments by pilots to complicated questionnaires
and rating scales have, for the most part, been developed for assessing aircraft handling rather than
pilot workload. A well known and accepted handling qualities rating scale is that of Cooper and Harper
(3),  which  refers  to  workload  by  asking  the  question:  "Is  adequate  performance  attainable  from  tolerable 
workload?" 

Clearly,  workload  levels  for  a  given  task  are  related  to  the  aircraft's  handling  characteristics, 
but a valid rating for the latter may not always give a reliable estimate of workload. Experienced test
pilots may be quite adept at using opinion rating scales but occasionally it seems difficult to separate
assessments of workload from those of handling qualities, leading to anomalies and ambiguities. Westbrook
and his colleagues (1) commented that: "If a reliable method were available to obtain a measure of
workload or stress, it is undeniably true that many of the anomalies in handling qualities data could be
explained." 

Several investigators have recorded physiological variables from pilots in real and simulated flight
as  a  means  of  estimating  levels  of  stress  and  workload.  This  paper,  by  describing  two  current  flight 
trials  and  by  referring  briefly  to  previous  studies,  examines  the  relationship  between  pilot's  heart  rate 
and  subjective  assessments  of  handling  qualities  and  workload. 

Materials  and  Methods 


All the subjects referred to in the following examples were qualified test pilots who were experienced
and current on aircraft type. Most of the flight trials involved either the take-off or the
approach and landing and so the task was well defined and realistically demanding. Performance was
closely monitored by on-board instrumentation and by airfield sited kinetheodolites. Whenever possible
flight  trials  were  designed  in  such  a  way  that  experimental  variables  could  be  compared  during  the  same 
sortie.  In  this  way  the  effects  of  weather,  learning  and  other  irrelevant  influences  were  minimised. 

Various  aircraft,  ranging  from  pure  research  to  representative  civil  and  military  types,  were  used 

Pilot assessments of handling qualities and workload were made by using the Cooper-Harper scale,
by straightforward comments, or by a questionnaire designed for a particular trial. In most cases the
pilot recorded his comments or gave a rating while in the aircraft; questionnaires were completed after
landing. Latterly, a formal workload rating scale, based on the Cooper-Harper handling scale, has been
constructed and is currently being evaluated.

At Bedford, pilots' heart rates are obtained by recording the ECG signal in analogue form with the
"R" wave being used to trigger a cardiotachometer. The resulting beat-to-beat rate is then plotted
against time for initial examination and analysis (Fig. 1). Subsequently, mean heart rates for
consecutive 30 sec epochs and mean values for a particular flight phase or sub-phase are used to compare
levels of workload.
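
The reduction from R-wave times to epoch means described above can be sketched as follows. This is a minimal illustration in Python, not the processing actually used at Bedford: the function names `beat_to_beat_rate` and `epoch_means`, the input format (a list of R-wave times in seconds), and the convention of assigning each rate to the time of the later R wave are all assumptions made for the example.

```python
from statistics import mean


def beat_to_beat_rate(r_times):
    """Return (time_s, bpm) pairs from successive R-wave times in seconds.

    Assumption: each rate is stamped with the time of the later R wave.
    """
    rates = []
    for prev, curr in zip(r_times, r_times[1:]):
        interval = curr - prev
        if interval <= 0:
            raise ValueError("R-wave times must be strictly increasing")
        rates.append((curr, 60.0 / interval))
    return rates


def epoch_means(rates, epoch_s=30.0, start_s=0.0):
    """Mean bpm for consecutive epochs of length epoch_s, keyed by epoch index."""
    buckets = {}
    for t, bpm in rates:
        if t < start_s:
            continue  # ignore samples before the first epoch
        index = int((t - start_s) // epoch_s)
        buckets.setdefault(index, []).append(bpm)
    return {index: mean(values) for index, values in sorted(buckets.items())}


if __name__ == "__main__":
    # Synthetic data only: one R wave every 0.5 s for one minute.
    r_times = [i * 0.5 for i in range(121)]
    print(epoch_means(beat_to_beat_rate(r_times)))
```

Means for a flight phase or sub-phase could be obtained in the same way by averaging the samples that fall between the phase start and end times rather than within fixed 30 sec epochs.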

Flight  Trials 

1.  "Ski-jump"  Ramp  take-offs. 

The first example is of a trial to assess the advantages of using an inclined ramp to improve the
take-off performance of shipborne HS Harrier VTOL combat aircraft (11). The aircraft is accelerated on
to the ski-jump shaped ramp from a short run (50-100m) with nozzles rotated rearwards. At the top of
the ramp, and on the point of becoming airborne, the nozzles are rotated downwards to a pre-set angle.
After reaching conventional flying speed the nozzles are rotated back to the aft position.

One of the Harriers used in the trial, a two-seat version (T3), is equipped for telemetering heart
rate from both cockpits. A portable ECG recorder (Oxford Instruments) is used to monitor pilot's heart
rate from the other (single seat) aircraft.

Two  or  three  ramp  take-offs  are  normally  carried  out  on  each  sortie  with  the  take-off  weight  and 
distance being varied between runs. The ramp was set at an angle of 6° for the first series of take-offs
and  at  9°  for  the  second  series.  Incremental  increases  in  ramp  angle  are  planned  for  future  stages  of 
the  trial. 

Performance measurements are essential in this trial but assessments of handling and workload,
especially for the critical period of accelerating along the ramp during the partially jet-borne flight
phase, are also important. Pilots' heart rates, together with subjective ratings of handling qualities,
using the Cooper-Harper scale, and of workload, using a special ten-point rating scale, have been obtained
for a large number of take-offs.

Three test pilots, flying the T4, provided the bulk of the heart rate data and subjective ratings.
Take-offs with and without autostabilisation, during simulated and real night conditions, and in
crosswinds up to 10K were evaluated.

Table  1  gives  overall  pilot  ratings,  and  mean  heart  rates  for  40  sec  epochs  centred  on  the  time  of 
nozzle rotation. This epoch includes a period of from 10 to 13 sec before releasing the brakes, a time
when  pilots  carry  out  a  final  check  of  instruments  and  configuration. 

TABLE 1

6° HARRIER RAMP TAKE-OFFS ("NORMAL" CONDITIONS). MEAN HEART RATES (40s),
HANDLING RATINGS (COOPER-HARPER) AND WORKLOAD RATINGS.

Pilot     n     Heart Rate bpm     HQ Rating     WL Rating

A         15    110.9              4             4
B         11    119.3              4             4
C         5     93.6               3             4

Fig. 1 is a typical beat-to-beat heart rate plot for the handling pilot during a ramp take-off flown
from the front seat. Close examination shows that his heart rate increased some 10-20 sec before going
to take-off power prior to releasing the brakes. Pilot comments confirm that the workload increases
rapidly at this time and remains high until conventional flight some 30 to 40 sec later. Overall
assessments of handling qualities and workload, after the 6° stage of the trial, were favorable and ramp
launches were considered to be easier than normal runway short take-offs (STOs).

Comparison of mean 60s heart rate shows no difference between types of take-off if the epoch includes
the 15-20s before rolling. However, if the epoch starts when brakes are released the runway STO heart
rates tend to be 2-3 bpm higher (Table II). The finding agrees with ratings for handling qualities and
for workload. The influence of ground effect, during runway STOs, seems to cause a deterioration in
handling with a consequent increase in workload.
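
The effect of where a 60s epoch starts can be illustrated with a short sketch. It is written in Python under stated assumptions and is not the analysis used in the trial: the names `window_mean` and `compare_epochs`, the 15 s lead before brake release (the text gives 15-20s), and the sample format of (time in seconds, bpm) pairs are hypothetical choices for the example, and the data are synthetic.

```python
from statistics import mean


def window_mean(rates, t_start, t_end):
    """Mean bpm of (time_s, bpm) samples with t_start <= time_s < t_end."""
    values = [bpm for t, bpm in rates if t_start <= t < t_end]
    if not values:
        raise ValueError("no heart rate samples in the requested window")
    return mean(values)


def compare_epochs(rates, brake_release_s, lead_s=15.0, epoch_s=60.0):
    """Return means for two epochs of equal length.

    One epoch starts lead_s before brake release, the other at brake release.
    lead_s is an assumed value within the 15-20s range quoted in the text.
    """
    early_start = brake_release_s - lead_s
    return {
        "from_before_release": window_mean(rates, early_start, early_start + epoch_s),
        "from_release": window_mean(rates, brake_release_s, brake_release_s + epoch_s),
    }


if __name__ == "__main__":
    # Synthetic data only: 100 bpm before brake release at 20 s, 110 bpm after.
    rates = [(float(t), 100.0 if t < 20 else 110.0) for t in range(80)]
    print(compare_epochs(rates, brake_release_s=20.0))
```

Because the earlier epoch includes the period before brake release, its mean is pulled towards the pre-release level, which is one way a difference between take-off types could be masked when the epoch starts before rolling.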

Neither night take-offs, both simulated and real, nor crosswinds up to 10K caused any difficulty and
resulting heart rate values and ratings were similar to those for 'normal' ramp launches. Take-offs in
the unstabilized mode tended to result in higher ratings with marginally increased heart rates. Results
of the 9° ramp evaluation showed ratings of handling qualities and workload to be similar to those for
6°. Heart rates, which were slightly lower, agreed with pilot opinion that 9° ramp take-offs were no
more difficult and could well be easier than those at 6°. Pilots commented on a smoother ride along
the 9° ramp.


FIGURE  1.  HS  HARRIER.  BEAT-TO-BEAT  HEART  RATE  AND  NOZZLE  ANGLE  RAMP  TAKE-OFF 

TABLE  II 

COMPARISON  OF  HARRIER  STO  MEAN  HEART  RATES  (60s) 


                6° Ramp                  Runway
Pilot       1          2             1          2

A           110.8      108.9         109.3      110.8
            (n = 15)                 (n = 21)

B           116.5      114.8         115.9      117.3
            (n = 11)                 (n = 6)

C           90.4       89.6          90.8       91.2
            (n = 5)                  (n = 18)

1  Epoch from 15 - 20s before releasing brakes.
2  Epoch from releasing brakes.

2.  Direct  Lift  Control. 

A modified BAC 1-11 is currently being used to evaluate the benefits of using Direct Lift Control
(DLC) to improve handling and performance during the approach and landing. DLC should enable the aircraft
to be flown more precisely on the glide slope and also result in better all round landing performance
with less touchdown scatter. Workload during this phase should be reduced and the ability to cope with
turbulence improved. More direct control of descent rate should prove beneficial during steep gradient
approaches and improve safety during the flare.

In addition to monitoring aircraft performance, pilot assessment of handling and workload, and
measurement of pilots' heart rates, are important trial requirements.

The  DLC  system  fitted  to  the  1-11  uses  the  four  wing  spoilers  to  generate  lift  changes;  these  are 
controlled  by  electrical  sensing  of  control  column  pitch  movements.  A  wash-out  circuit  is  inserted  into 
the  system  to  provide  the  pilot  with  relatively  normal  acceleration  responses  to  pitch  inputs. 

The first batch of flying (phases one and two) was concerned mainly with optimising the control
characteristics and giving the pilots some experience of new handling techniques. Control was rated
better after the DLC was [...] better still after a lag was incorporated in the DLC signal from
the control column. Pilots' heart rates were monitored on only a few sorties during this stage of the
trial.

The main flight investigation (phase three) was aimed at evaluating DLC during both conventional 3°
approaches and 6° steep gradient approaches. Each sortie included a batch of basic-aircraft runs for
comparison with DLC.

For most sorties pilots were briefed to concentrate on a precise position at 50 ft and glide slope
tracking was not appreciably better. But on the occasions when pilots were briefed to maintain precise
glide slope tracking, performance improved. This was particularly evident on the 6° approach gradient.
Landing performance was definitely better when off steep approaches but not noticeably different when
from 3° approaches. The ability of DLC to quickly arrest descent helped to produce more accurate and
smoother landings, but if the flare was started too early there was a tendency for pilot induced
oscillations (PIOs) to occur.

Overall pilot assessments indicated that the aircraft's handling qualities in the landing configuration
were improved with DLC. Workload during the approach and in the flare was thought to be generally
lower, but especially so for the 6° glide slope. Heart rate responses appeared to agree with pilot
opinion although, at first, there were a small number of discrepancies. These were resolved after
discussion with the pilots. For example, when PIOs occurred heart rate responses for the flare epoch were much
higher; the overall effect was to result in mean values for this manoeuvre which were similar whether DLC
was used or not.

Perhaps  it  is  not  surprising  that  pilot  ratings  for  the  flare,  both  of  handling  and  of  workload, 
varied considerably according to whether PIOs were present or not!

Fig. 2 shows mean heart rate values for 3° and 6° approaches, with and without DLC, flown in similar
weather conditions by one of the three project test pilots. These results show an obvious trend in favor
of DLC but the only appreciable reduction in heart rate is during the glide slope interception and early
part of the 6° approach.

FIGURE 2. BAC 1-11. MEAN 30s HEART RATE VALUES FOR DLC AND BASIC-AIRCRAFT

Results for one of the other two pilots were similar to those illustrated, again showing a definite
trend in favor of DLC. Mean heart rate values for the third pilot did not differ appreciably between DLC
and basic aircraft approaches. In fact, because PIOs seemed to disturb this pilot more, mean rate for the
flare from 6° approaches was slightly higher with DLC. He assessed handling qualities overall as being
improved with DLC; he was unsure about workload on 3° slopes but felt it was reduced on 6° approaches.

Results from subsequent flights (phase four), following minor changes to the system, have confirmed
the benefits of DLC. However, because pilots were briefed to fly more accurate glide slopes, heart rate
values were not noticeably reduced, but performance was improved. A sortie of 3° and 6° approaches flown
in turbulence provided the opportunity to demonstrate the advantages of DLC. This was confirmed by the
markedly lower heart rates for DLC approaches when compared with basic-aircraft approaches.

It  was  hoped  to  carry  out  sufficient  flying  to  allow  statistical  analysis  of  results,  but  the  number 
of  sorties  has  been  limited  and  only  trends  were  established. 

Comments 


The two trials described above are both typical examples where assessment of handling qualities and
related workload are important features. However, they differ in some respects. The concept of the
'ski-jump' ramp is aimed at increasing the maximum take-off weight of ship-borne vectored thrust combat
aircraft, thereby improving their overall tactical performance. This being the primary objective,
handling qualities and workload are of secondary importance; it is only necessary to ensure they are not
increased beyond a level which might jeopardise the take-off.

DLC, on the other hand, is aimed primarily at improving safety during the approach and landing.
Therefore, assessment of handling and workload assumes much greater importance. Of course, performance,
because of its relationship to safety, is also important.

In  both  examples  there  is  generally  good  agreement  between  handling  qualities,  workload,  and  heart 
rate.  The  few  anomalies  that  have  occurred,  especially  in  the  early  stages  of  the  trials,  have  been 
resolved by detailed discussion with the pilots. For example, a high heart rate and high workload rating
but low rating for handling qualities during the first ramp take-off by one test pilot was due to
excessive anticipation. The pilot rated the workload as 7 (out of 10) and the handling as Cooper-Harper
3. His 40 second mean heart rate was 156 bpm. Afterwards he reported: "with the benefit of hindsight,
I realize that I was much more keyed up than I need have been, and I expect that my workload will be very
much less on subsequent launches as I gain experience." His subsequent workload ratings averaged 4 with
autostabs on and 5 with autostabs off and the corresponding handling assessments were, similarly,
Cooper-Harper 4 and 5. The overall mean heart rate level for eleven 6° take-offs was 119 bpm. The high heart
rate generated by the first ramp take-off is typical of the increased arousal experienced by test pilots
about  to  carry  out  a  novel  flight  task.  Roscoe  (12)  has  suggested  that  experimental  test  pilots  frequently 
overestimate  the  level  of  difficulty  for  the  first  run  of  an  untried  or  unusual  manoeuvre  or  task. 

Occasional  heart  rate  measurements  of  two  pilots  during  the  preliminary  phases  of  the  DLC  trial 
resulted  in  appreciably  lower  values  for  the  new  system  compared  with  the  basic-aircraft.  These  pilots 
were also most enthusiastic in their early comments on the system. It was, therefore, something of a
disappointment to find smaller decreases in heart rate when it was routinely monitored during phase three.
Subsequent discussion revealed that in the flight trial proper, pilots had changed their strategy and flew
more precisely than in the previous stage. Improvements in handling were apparently being used to increase
performance although this was not always measurable. Occasional discrepancies between heart rate and
pilot  opinion  were  caused  by  failure  to  record  the  fact  that  PIOs  occurred  during  the  flare. 

Previous  Studies 


For  some  nine  years,  at  RAE  Bedford,  pilot's  heart  rates  have  been  monitored  during  various  flight 
trials  as  part  of  a  long  term  study  of  workload  and  stress.  Evaluation  of  handling  qualities  was  a 
primary requirement of many of these trials and it is interesting, and perhaps profitable, to refer
briefly  to  some  earlier  ones. 

Autostabilisation systems should lessen the effects of poor stability and control and thus lead to
an overall improvement in handling. The VTOL research Short SC1 was an example of a relatively unstable
aircraft and pilots comparing the stabilised with the unstabilised configuration invariably commented on
the marked improvement in handling qualities of the former. It is interesting to compare heart rate
responses for one pilot flying two similar 6 min sorties consisting of a vertical take-off to 30m (100 ft),
small accelerations and decelerations, ending in a vertical landing. The first flight was stabilised
and resulted in a mean heart rate of 109.6 bpm; the second flight, which was unstabilised, resulted in
[...] autostabilisation, but similar heart rate comparisons for other test pilots did not show any differences.
Detailed discussion with pilots revealed that most of them suspected the integrity of the autostabilisation
system and transferred the spare effort made available by improved handling to monitoring the system itself.

Even though handling qualities and workload are closely related, it does not necessarily follow
that an improvement in handling will invariably lead to a reduction in workload. It may sometimes be
preferable, especially for well motivated pilots, to improve performance and maintain the same level of
workload. If performance is monitored such improvement will be obvious and this alone will indicate the
degree of benefit gained by improved handling. Such improvement is evident in some of the DLC flying
referred to earlier. Pilots also make use of additional spare effort or capacity to increase monitoring
or to carry out other covert tasks which are not immediately obvious.

Gerathewohl  (13)  made  the  point  that  subjective  ratings  of  handling  qualities  "....  as  accurate  as 
they  may  be  in  regard  to  control  desirability  or  difficulty,  do  not  contribute  to  workload  determination, 
since  they  are  only  loosely  connected  to  task  demands  and  pilot  response."  Certainly,  as  in  the  above 
example,  a  pure  handling  qualities  scale  may  not  give  an  accurate  estimate  of  workload.  It  is  clear  that 
subjective  assessments  of  workload  must  be  derived  from  rating  scales  specifically  designed  for  the  purpose. 

Subjective assessments, in general, are sometimes unreliable; for example, it is known that they are
susceptible to both inter- and intra-subject inconsistency. In particular, subjective ratings for what
may be minimal changes in handling characteristics can be misleading by suggesting the existence of larger
differences, especially if the ratings were obtained under different conditions. Such anomalies may be
due  to  a  poorly  designed  rating  scale,  because  the  test  pilot  has  varied  his  assessment  strategy,  or 
because  of  undetected  changes  in  flight  conditions. 

This problem is typified in the trial of a powerful rudder autostabiliser fitted to the BAC 221 slender
delta supersonic research aircraft. Lateral directional handling characteristics during the landing
approach were assessed by three test pilots. The task consisted of a "side-step" manoeuvre at a height
of 75m (250ft) placing the aircraft to one side of the centre line. An "S" turn was necessary to realign
the aircraft with the runway, thereby testing the effectiveness of the system. Different autostabiliser
settings were evaluated and, as it was possible to vary these in flight, the associated handling
characteristics were compared under similar conditions. The Cooper-Harper rating scale was used for this
purpose. The pilot's heart rate was monitored on several flights so that mean values for each approach
could be compared. It was also possible to examine the relationship between levels of heart rate and
ratings of handling qualities.

Except for the extreme autostabiliser settings and for "no autostab" approaches, when ratings and
heart rates were appreciably higher, results were disappointing. Heart rate values were inconsistent.


It is tempting, therefore, to conclude that the differences between the various autostabiliser settings were
inconsequential and that heart rate measurement correctly interpreted this fact.

Disagreements between workload assessment and heart rate, which have been rare, have tended to occur
during relatively undemanding tasks when changes have been minimal or unimportant.


FIGURE 3. VC-10. MEAN 30 s HEART RATE VALUES FOR NORMAL CONDITIONS AND
SEVERE TURBULENCE: a. 3°, b. 5°/3° APPROACHES AND LANDINGS

Different weather conditions can influence handling to varying extents, and turbulence, in particular,
causes increased workload by degrading stability and control. This is especially noticeable during a
flight task where accurate tracking is required. Fig. 3 compares the heart rate responses of a pilot
flying two different types of approaches and landings in severe turbulence with mean rates for similar
approaches flown in relatively smooth conditions. This example is from a flight trial of noise abatement
approaches using a BAC VC-10 (14). It can be seen that there are marked increases in heart rate for
both types of approach, though the increase is marginally greater for the earlier and steeper section of
the 5°/3° two-segment profile when compared with the 3° gradient. These findings agreed closely with the
pilot's assessment of the changed handling qualities and workload. He considered that turbulence increased
the workload more for the two-segment than for the conventional approach, especially during the acquisition
and early part of the glide slope.

Pilots occasionally reveal some degree of bias towards or against a particular experimental flight
condition which, based on fallacious reasoning, may affect their judgement and result in misleading
subjective ratings.

In the early stages of a series of flight trials to evaluate various types of noise abatement
approaches, initial pilot opinions of 7½°/3° two-stage flare approaches, in an HS Andover, were unfavorable.
Pilots felt instinctively that transiting from a 7½° slope to one of 3° at a height of 200 ft would be
too demanding and they rated the workload quite high. However, from the beginning of the trial heart
rate responses for this approach profile were similar to those for conventional 3° approaches. Careful
thought and subjective re-analysis, by the two test pilots who flew early sorties, led to a review of
their original subjective assessment of workload. Subsequently, these pilots, together with other
participating pilots, tended to prefer the 7½°/3° approaches to the 3°. They considered that improved
handling on the steep segment was an important factor in maintaining a reasonable level of workload.

Heart rate and workload assessments showed good agreement (13).

In  this  instance  both  the  task  and  the  handling  qualities  are  changed  but  the  net  result  is  that 
workload  and  heart  rate  are  unchanged. 

The previous trials were concerned mainly with stability and control which, to many people, used to
be the accepted interpretation of the term handling qualities. But, as stated earlier, evaluation of
guidance systems is also relevant; indeed a large proportion of test flying at Bedford is directed to
evaluating approach guidance displays and systems.

A typical trial, which took place in 1971, was to evaluate an airborne visual approach indicator
(VASI) presented as a HUD. For the purpose of the experiment only omnidirectional runway edge lights
and green threshold lights were used. All other lights were extinguished and moonless nights were
selected for the trial sorties. Two pilots alternated as experimental pilot (P1) and co-pilot (P2) for
four sorties giving a total of 32 approaches. Each run ended in an overshoot at 30m (100 ft). Four
different runways were used in order to reduce any element of learning, though, as it happened, varying
weather conditions eliminated this effect. The aircraft used for this stage of the trial was an HS
Comet 2E.

Glide slope performance was considerably better with the HUD-VASI and the heart rate of the handling
pilot was reduced. Compared with no-aid approaches, the overall decrease was 4.2 bpm for one and 6.6 bpm
for the other pilot; whereas there were only negligible differences in heart rate when acting as co-pilot.
Both pilots were keen to point out that workload was significantly reduced by the HUD and they anticipated
larger decreases in their heart rates but agreed that the benefit of the aid would have been greater had
the experimental approaches ended in a landing rather than in an overshoot.

Improvement in performance without any evidence of increased workload was, in itself, adequate proof
of the advantages of the HUD-VASI. Nevertheless, the trial scientists were delighted to have evidence of
a reduced workload as well. Unfortunately, because of the severe and widely differing weather, the wide
variations in heart rate between the sorties precluded statistical significance.

These studies, which have used examples of data obtained during operational test flying, demonstrate
the use of heart rate as an indicator of workload. Such data have proved to be of great value in the
overall study of pilot workload, but it has to be admitted that the direct value of heart rate
measurement in evaluating handling qualities is still not clear. Nevertheless, pilots and engineers associated
with these trials consider monitoring heart rate to be a worthwhile adjunct to those techniques commonly
used in flight evaluation. It should be noted that these examples have been confined to trials where the
pilot has been handling the controls. Heart rate changes for handling pilots have, for the most part,
proved to be reliable indicators of important changes in workload when the task has been realistically
demanding. But in other trials at Bedford, where the pilot has been in a monitoring role, his heart rate
responses did not appear to reflect changes in workload with anything like the same reliability. This
difference in heart rate sensitivity, between the pilot in the control loop and the pilot outside the
loop, is important.

Discussion


A large number of reports on aircraft handling trials refer to related levels of pilot workload.
However, it is patently obvious that in most instances assessing handling characteristics was the primary
objective whereas estimating workload was very much a secondary aim. This approach is usually adequate
and leads to realistic estimations of workload, but it is apparent that sometimes a pilot's main concern
with handling has adversely affected his ability to assess workload. Ellis (16), in pointing out that
it is important that ratings for workload and handling qualities are not confused, wrote: "When pilots
are asked to make a formal assessment of workload as a primary measure, it should be absolutely certain
that workload is the ultimate aim of the exercise." Ellis also observed: "Workload is always important
in handling qualities investigations and so pilots should be encouraged to comment on it and rate it but
workload should not be allowed to usurp the place of the handling qualities rating where the latter is
the more appropriate measure."

It is clearly an advantage to use a specially constructed rating scale for assessing workload during
flight testing. Unfortunately, such scales suffer from the same problems and attract the same criticism
as do pilot opinion scales for assessing handling qualities. By using some other method of estimating
workload it may be possible to augment pilot opinion and, perhaps occasionally, resolve anomalous findings.
Physiological variables, which have been recorded by many research workers during studies of pilot workload
and stress, may be used for this purpose. The literature, though, contains few reports where the
relationship between handling qualities, workload, and physiological responses has been studied in detail.

In 1962 Roman and Lamb (17), in discussing the results of measuring heart rate in flight, observed
that: "Pulse rates correlate well with the pilot's estimates of the difficulties connected with
handling the aircraft during any one phase of flight." Rowen (18) pointed out that high heart rates
recorded from the pilot of the M2 lifting body were associated with the poor lift/drag characteristics,
which made particularly heavy demands on pilot skill. By measuring pilots' heart rates, Billings et al.
(19) demonstrated that helicopters fitted with hydraulic boost systems were significantly less demanding
to fly. Hasbrook and his co-workers (20) used heart rate measurement to augment pilot opinion during the
flight evaluation of a new instrument display. They were able to show that the new display, which
reduced panel space by 25%, was an acceptable alternative to the conventional display.

These examples from the literature demonstrate a relationship between handling characteristics and
workload as indicated by heart rate. But what is the extent of this relationship? Is it reliable and
consistent? Can it be usefully employed in flight evaluation? Is monitoring pilots' heart rate during
test flying a practicable exercise?

The technique of monitoring heart rate is relatively simple; it does not intrude into the flight
task nor does it compromise flight safety. It is readily accepted by pilots. In fact, Bedford test
pilots have co-operated to the extent of applying their own electrodes and preparing their monitoring
equipment for flight on many occasions. The resulting heart rate data are often studied with interest
by the pilots who find them helpful in recalling various aspects of the sortie.

Heart  rate  does  not  give  absolute  values  of  workload  and  so  in  order  to  obtain  meaningful  results 
it  is  necessary  to  use  it  as  a  comparative  measure.  It  is  worth  noting  that  pilot  rating  scales,  though 
appearing  to  give  absolute  values,  because  they  are  subjective,  are  really  scales  of  comparison  as  well 
(21). 


The examples of flight trials presented in the previous section relied on comparison with another
experimental condition or with some form of datum; and wherever possible the comparison was made during
the same sorties. The trials by Billings and his colleagues (19), and by Hasbrook et al. (20), referred
to above, were similarly based on comparison.
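
The comparative use of heart rate described above can be illustrated with a minimal sketch. It assumes that each sortie yields one mean heart rate for a datum condition and one for the experimental condition; the function name, the record layout and all numerical values are hypothetical and are not data from the trials.

```python
from statistics import mean


def within_sortie_differences(sorties):
    """Return (per-sortie differences, mean difference) in beats per minute.

    Each entry of `sorties` holds the mean heart rates (bpm) recorded on ONE
    sortie for the datum condition and for the experimental condition.
    """
    diffs = [s["experimental"] - s["datum"] for s in sorties]
    return diffs, mean(diffs)


# Hypothetical values for illustration only; not data from the trials.
example_sorties = [
    {"datum": 100.0, "experimental": 95.0},
    {"datum": 120.0, "experimental": 114.0},
    {"datum": 90.0, "experimental": 86.0},
]

diffs, mean_diff = within_sortie_differences(example_sorties)
print(diffs, mean_diff)
```

In this invented example the datum rates differ by up to 30 bpm between sorties, as might happen with changing weather, yet the within-sortie differences stay within a narrow range. This is the reason for making the comparison during the same sortie wherever possible.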

To compare the changes in workload caused by different handling characteristics it is necessary to
ensure that other aspects, such as the flight task itself, remain constant. This is sometimes difficult
to achieve. For example, different flap settings on the approach may change handling but may also require

Subjective assessments of handling and workload appear to be more consistent when the task demands a
high level of piloting skill than when it requires little effort. Likewise, heart rate measurements
are more consistent and reliable if the experimental flight task is realistically demanding. Data from
the Bedford studies show that test pilots have, for the most part, given estimates of workload which
agreed well with their heart rate levels. But the agreement was better when heart rates, and presumably
workload levels, were higher; and as might be expected, anomalies tended to occur more often when workload
and heart rates were low. It is interesting to note, though, that from measurement of heart rate and
finger tremor, Nicholson and his co-workers (22) concluded that: "... high workload associated with
difficult approaches and landings rendered the pilot's subjective assessment more variable."

Results from workload studies made in real flight are generally more reliable than those made in
simulated flight and this must be particularly so in studies of handling-related workload. Unfortunately,
it is difficult to set up well controlled flight experiments and so there is a strong temptation to resort
to using laboratories and simulators for workload investigations. Protagonists of these techniques point
to the undoubted value of research simulators in assessing handling qualities. But laboratory and
simulator experiments tend to restrict the number of input parameters to which the pilot is assumed to
respond. In real life the pilot is faced with a wide range of input information, much of it redundant,
but all liable to have some effect on his behavior and hence his workload.

As noted in the HUD trial, the inability to control such variables as weather, and the small number
of experimental sorties (limited by the high cost of flying aeroplanes), often result in differences
in heart rate which are not statistically significant. Nonetheless, trends, especially if they support
pilot opinion, may be quite adequate, but even if they conflict, heart rate data can be most valuable
in attracting attention to possible ambiguities. Further examination and discussion with the pilot may
then reveal previously undetected factors. Beat-to-beat heart rate is particularly useful in identifying
short term changes in workload which may not be obvious to a pilot making an overall assessment.

Unfortunately, most of the flight trials at Bedford did not use numerical rating scales for assessing
workload and some trials did not use them for assessing handling. These omissions, together with the
limited number of experimental sorties, have precluded any opportunity for statistical analysis of the
relationship between handling qualities, workload and heart rate. A flight trial designed to examine
this relationship more closely is currently underway. Three test pilots, flying three different aircraft,
compare handling characteristics during various demanding tasks. These include the approach and landing,
low-level high speed flight, and formation flying. Pilots use the Cooper-Harper scale for rating handling
qualities and a 10 point scale (based on Cooper-Harper) for rating workload. Heart rate is recorded on
all sorties.

It is hoped that this investigation will result in enough data to permit some degree of statistical
analysis. It is worth noting, though, a point made by McGregor (23) who stated: "One of the criticisms
of numerical pilot rating scales as opposed to adjectival scales is that statistical games will be played
with numbers that are not statistically meaningful." He continued: "If statistical indices are used they
must be adequately enough defined to enable the reader to assess their validity and sufficient data
presented to allow a check to be made of the results." With this in mind, it is not intended to attempt
to identify any mathematical relationship between assessments of handling or workload, based on rating
scales, and heart rate. The individuality of pilots makes this virtually an impossible task anyway.

Summary  and  Conclusions 

There is obviously a distinct advantage in augmenting a pilot's subjective assessments of handling
qualities and workload. This paper presents practical examples where pilots' heart rates have been used
to augment their opinions of handling and workload during various flight trials. These studies provide
good evidence to show that, in general, this technique gives reasonably good indications of the workload
generated by particular handling qualities.

The technique of monitoring heart rate is simple. It is accepted by pilots, and it is compatible
with test flying. To improve reliability and consistency the flight task should be realistically demanding
and require the pilot to be in the handling loop. Comparisons between experimental conditions, or with
some form of datum, give more meaningful results; wherever possible comparisons should be made during the
same sorties. Raw data in the form of beat-to-beat heart rate are invaluable for revealing rapid and
short duration changes in handling qualities which affect workload.

In this way, potentially misleading results can be identified in good time, thereby drawing attention
to the need for further investigation. Anomalous findings may be resolved by examination of heart rate
data and by discussion with the pilot.

The author has not made any attempt to satisfy strictly scientific criteria, the primary objective
being to draw attention to the value of using heart rate as a flight test procedure. But, in addition,
it is hoped to stimulate thought and discussion so that it may be possible to reduce some of the anomalies
found in handling qualities and workload, referred to by Westbrook et al. (1).

The use of brain waves (EEG) for the enhancement of the performance of aircraft pilots is an idea
which requires, for its development, the integration of two previously independent lines of research
endeavor: human performance assessment and central nervous system neurophysiology. A human performance
research paradigm specifically relevant to the study of pilot performance, in the context of which the
use of brain waves may feasibly be studied, will be discussed later. Attention is now directed to the
state of the art of brain wave research and brain-behavior relationships, specifically those aspects which
are considered to be feasibly and usefully applicable for potential use in simulated aircraft crew stations
or eventually in a real-world environment.

BASIC  RESEARCH 


Two basic types of paradigms have been employed in studies of brain waves and performance. In the
first case, spontaneous, ongoing EEG is monitored, and frequency and amplitude for some time period
(usually preceding, during, and/or succeeding some experimental treatment) are related to various aspects
of performance. Usually this sort of treatment has included use of an intervening variable called, most
frequently, activation or arousal; Davies and Parasuraman (1977) have identified four separate types of
experimentation within this rubric. The first type of study "has attempted to discover whether decrements
in (signal) detection rate, or sometimes increments in detection latency, are paralleled by corresponding
changes in one or more psychophysiological measures. . ." Another approach has attempted to identify
psychophysiological (not only EEG, of course) processes and events which discriminate between periods
preceding successful as opposed to unsuccessful attempts to detect a signal. A third has involved varying
environmental parameters presumed to affect arousal, thereby causing an ultimate effect upon performance
if in fact arousal and performance are related. Frequently these studies have led to the observation of
performance changes without concomitant variation in physiological measures of arousal; as Davies and
Parasuraman have put it, ". . . a dissociation of performance indices and physiological measures occurs."

A fourth approach has attempted to predict individual differences in level and quality of performance
from baseline scores on physiological measures, through the so-called arousal hypothesis of vigilance.
Generally, this hypothesis posits an inverted U relationship between arousal and performance; at low
levels of arousal errors of omission (e.g., missed detections of targets) occur, and at high levels the
well known detrimental performance effects of stress and high anxiety are seen. Arousal is measured
independently, usually via physiological events.

From his 1970 review, however, O'Hanlon is led to the conclusion (quoted by Davies and Parasuraman
1977) that "No reliable physiological index of alertness has been accepted, although several promising
ones have been proposed. No physiological variables have been found that are as sensitive to task and
environmental effects as is performance. No underlying process has been so clearly defined as to permit
rational control of cerebral vigilance." Davies and Parasuraman agree with his discouraging assessment,
and suggest that methodological deficiencies result both from inadequate performance measurement and from
the lack of an extant task taxonomy to make sense of the enormous number of experimental situations which
have been used. Beck in his 1975 paper considers that brain waves cannot feasibly be brought under
stimulus control, and therefore does not include EEG studies (as contrasted with evoked potential studies)
in his review.

Nevertheless, a few semi-consistencies have been observed across experiments and laboratories,
although virtually no general conclusion can be put forward which does not admit of some exception or can
be considered invulnerable to challenge. Certainly decrements in detection performance, rate, and/or
latency are usually seen to be accompanied by EEG changes as fatigue develops over the course of lengthy
experimental sessions, and it is clear that the probability of failure to respond to transitory signals
altogether increases considerably under conditions of lowered arousal, i.e., when the subject is bored
or sleepy. Further, several studies have indicated that individuals with greater baseline lability for
GSR (though not, so far, for EEG) do better in detection situations and generally are better able to
maintain a state of vigilance. Variation in the complexity of visual stimuli, memory task requirements,
and differential hemispheric activation via varying stimulus modalities have all been shown to affect EEG
measures in relatively stable ways. Brain waves have been reliably shown to vary with behavioral sleep
events in a relatively stable manner, and consistently over the time course of a normal night's sleep,
and thus probably reflect variation in arousal (albeit at the very low end of the scale). Gales (1977)
points out that ". . . high alpha and beta frequencies are more sensitive to discrete changes in
stimulation than are lower alpha frequencies and theta activities. . ." indicating that the relationship
of EEG events to arousal is more easily studied in alert states. Gales' statement is made in summary,
and follows a passage in his paper where he acknowledges that arousal is not a unitary state which has
straightforward and systematic relationships with measures of behavior or of subjective report. He goes
on to say that changes in theta and lower ranges of alpha reflect other, presumably non-task-relevant,
effects.

Other attempts to relate ongoing EEG experimentally to vigilance, or to other behavior, especially
under conditions of fatigue, have been made. Consistent with data from other situations, it is typically
found that changes in fatigue and arousal can be inferred from brain wave activity (power shifts from
higher toward lower frequencies, i.e., from beta toward theta and delta).
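
The power shift mentioned above can be made concrete with a small sketch. It is not the analysis used in any of the cited studies. The band limits are conventional values assumed here (published limits vary slightly), the signals are synthetic sinusoids rather than real EEG, and the names `relative_band_power` and `synthetic_eeg` are hypothetical.

```python
import cmath
import math

# Conventional band limits in Hz (an assumption; published limits vary).
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
}


def relative_band_power(samples, sample_rate_hz):
    """Return the fraction of 0.5-30 Hz power that falls in each band."""
    n = len(samples)
    power = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):
        freq = k * sample_rate_hz / n
        # Plain discrete Fourier transform coefficient for frequency bin k.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                power[name] += abs(coeff) ** 2
    total = sum(power.values())
    if total == 0.0:
        return power
    return {name: p / total for name, p in power.items()}


def synthetic_eeg(beta_amp, theta_amp, sample_rate_hz=64, seconds=2):
    """Sum of a 20 Hz (beta) and a 6 Hz (theta) sinusoid; illustration only."""
    n = sample_rate_hz * seconds
    return [beta_amp * math.sin(2 * math.pi * 20 * i / sample_rate_hz)
            + theta_amp * math.sin(2 * math.pi * 6 * i / sample_rate_hz)
            for i in range(n)]


alert = relative_band_power(synthetic_eeg(1.0, 0.3), 64)
drowsy = relative_band_power(synthetic_eeg(0.3, 1.0), 64)
print(alert["beta"] > alert["theta"], drowsy["theta"] > drowsy["beta"])
```

A fall in the beta fraction together with a rise in the theta fraction between two recording periods is the kind of shift toward lower frequencies from which reduced arousal is inferred.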


Prepared for the Environmental Physiology Program, Office of Naval Research, USA




The work of O'Hanlon and Beatty (1977) supports the general form of the arousal hypothesis of
vigilance, showing that increases in the percentage of alpha and theta, and decreases in beta, were related to
variation in performance on a simulated radar watching task. As Beatty and O'Hanlon point out, alpha
may either increase or decrease with arousal; concurrent changes in theta and beta must be taken into
account (i.e., are frequencies increasing or decreasing?) before sense can be made of the variation in
alpha.

Finally, there is some indication (e.g., Dimond 1977) that hemispheric differences may relate to
arousal and vigilance performance. Split brain studies suggest that the two hemispheres may have
different vigilance systems. Perhaps, as suggested by Jerison (1977), the left hemisphere deals with
selective attention and the right with continuous attention.

It is probably a fair statement that there is not much promise of new and exciting use of ongoing
EEG for the enhancement of pilot performance at the moment. There appears to be sufficient consistency
in the literature so that some confidence may be felt in the use of changes in brain wave power across
frequencies to infer rather general state changes. One can tell when a subject is getting drowsy, has
gone to sleep (and brief bursts of sleep frequently appear under sustained task performance requirements,
especially when some degree of sleep deprivation exists), or, to a lesser degree of certainty, is simply
inattentive. This kind of information is not without interest and use, but its lack of information
specificity and very low data rate lead to the conclusion that the instrumentation and data processing
requirements to collect and act upon it would not likely pay off in greatly enhanced performance, though
its potential for monitoring organismic state is obvious.

The event-related potential (ERP) is another matter. The ERP is an EEG response evoked by a specified
stimulus and usually averaged over a group of trials. A series of positive and negative deflections
is observed, usually conceptually and empirically divided into two categories. The earlier components,
those occurring in the first 100 ms or so subsequent to the stimulus, are referred to as exogenous: they
reflect characteristics intrinsic to the stimulus event itself, such as loudness, brightness, intensity,
or other psychophysical attributes. This activity is considered to represent the processing of sensory
information. The later components, up to perhaps 600 ms beyond the stimulus, are considered to be
endogenous, reflecting cognitive processes and attributes of the stimulus deriving not from its physical
properties but rather from its task-relevant context (e.g., whether it is to be counted or ignored, its
surprisingness, its information value, etc.). It is these latter components, reflecting as they seem to
aspects of performance potentially applicable to cockpit or crew station situations, which are of primary
interest. The following discussion of these later ERP components and their studied relationships is
largely derived from comprehensive and thorough reviews by Donchin et al (1977) and Beck (1975), and to
a lesser degree from the recent chapter by John and Schwartz (1978). The catalog of endogenous components
offered by Donchin et al (1977) is worth presenting in full:

N200. This component is elicited whenever a rare or unexpected
event occurs. It is of particular interest because it can be elicited
by stimuli that are in the periphery of the subject's attention. Unlike
the other endogenous components it appears sensitive to the modality of
the stimulus. The positive-going return of this component is sometimes
labeled P3a.

P300. This robust endogenous component is reliably recorded in
association with task relevant, rare stimuli. We apply this label to
a component whose latency may range from 275 to at least 600 msec.
It is characterized by its scalp distribution as it tends to be larger
in the central and parietal electrodes. It is further characterized
by a very specific response to experimental manipulations.

Slow Wave. This is a slow potential shift that, as far as is
known, is affected by the same variables that affect P300, except
that it has a different scalp distribution. Whereas P300 appears
largely as a positive-going potential peaking on the parietal
scalp, the Slow Wave is positive-going at the parietal electrodes
and negative-going in frontal electrodes. As the Slow Wave is so
closely associated with P300, it will not be discussed further in
this report.

The Contingent Negative Variation (CNV). This term is normally
used to describe the slow negative shift of potential that occurs in
the warned foreperiod preceding a motor or mental task. It begins
approximately 400 msec after the warning stimulus and, normally,
terminates after the imperative stimulus, that is, the stimulus
demanding a response or decision by the subject.

The Readiness Potential (RP). As the CNV, this is an event-preceding
negative shift. It is distinct in the sense that it
appears prior to self-paced voluntary responses. Its occurrence
is independent of the presence of an eliciting, or command, stimulus.

As  John  and  Schwartz  (1978)  stated,  these  endogenous  components  have  been  studied  in  connection  with 
arousal,  attention,  selective  attention,  emotional  valence,  assessment  of  novelty,  time  estimation, 
uncertainty, detection of targets, differential identification of stimuli independent of size and shape,
and  the  semantic  classification  of  linguistic  symbols. 

Some of the more potentially useful and applicable findings that have occurred with some consistency
are as follows:

o It is clear that ERP component amplitude is related to attention. For example, P300 (whose
latency actually varies from 250 to 600 ms) is rather significantly amplified by the perception of a
meaningful or surprising stimulus (or by the absence of a stimulus, when one is expected); in other words,
by the resolution of uncertainty. This basic finding has been observed in situations involving reaction
time, signal detection, signal confirmation, pattern completion, motor set, as well as other experimental
paradigms. The finding that P300 occurrence follows omission of an expected stimulus seems especially
interesting, demonstrating clearly that this component reflects cognitive rather than sensory processing.
The potentials evoked by missing stimuli are indistinguishable in conformation from those evoked when
sensory stimuli are in fact presented. Results of studies utilizing the omitted stimulus paradigm allow
the inference that an internal model of sequences of stimulus events is formed, and that the P300 is
evidence of a mismatch between this model and the observed (non)event which unexpectedly does not occur.

o In general, the P300 is enhanced only when stimulus information is being actively processed and is
uniquely associated with the occurrence of a signal and its correct detection (Beck 1975). It occurs
subsequent to stimuli in any sensory modality. Evidence exists that amplitudes and latencies vary over
the scalp, and appear to interact with different task analysis requirements for cognitive processing.

o There is some indication that auditory ERP's vary interhemispherically as a result of a linguistic
task; left-side responses are larger when a task requires linguistic analysis but not when the same stimuli
are compared non-linguistically.

o John's (1978) review of P300 variation relating to semantics and logic, and varying over the scalp
in amplitude and latency, led him to the conclusion that:

these articles (e.g., Thatcher, R. W. 1976) provide compelling
evidence that the features of the ERP are influenced dramatically
by subtle semantic relations such as whether stimuli are synonymous
or antonymous, whether they are bilingual equivalents or not, or
whether they are logically true or not. The changing latency and/or
anatomical locus of the LPC differences, observed as these information
parameters shift within an overall match-mismatch paradigm of otherwise
constant design, provide perhaps the most convincing evidence that these
ERP features relate to cognitive mental processes rather than the
nonspecific or physical factors in these experiments.

The CNV has also proved to be a fertile source of research efforts attempting to relate aspects of
its occurrence to stimulus attributes. The slow negative shift of brain potential occurring between a
warning and an action stimulus which is referred to as CNV has attracted the attention and interest of a
number of investigators, resulting in a body of literature summarized by Beck (1975) into groups of studies
which interpret the CNV as reflecting expectancy, motivation, conation, or attention. Beck asserts the
overall implications that (a) the CNV is not a single process, and (b) its nature and cerebral
topography are dependent upon the state of the organism and the task imposed. He points out that CNV
magnitude has been seen to relate to the uncertainty, intensity, and amount of information in the action
stimulus, the interstimulus interval, concentration, and anxiety.

Rosenzweig and Leiman (1968) provide a useful summary of the major research approaches to the study
of brain-behavior relationships:

Interest in the relation between electrical indicants of neural
function and dimensions of behavior has taken several forms. First,
considerable research has been directed toward the examination of the
neural representation of sensory inputs, particularly the relationship
between neural coding and generalizations derived from the data of
psychophysical experiments. A detailed specification of the transfer
operations at relay nuclei in particular sensory systems has been noted
by several investigators. Second, various investigators have sought to
relate electrical parameters of neural functioning to more elaborated
behavioral phenomena: dimensions such as attention and learning. This
type of experimental inquiry has been directed to analyses of mechanisms of
integration and to relating regional electrophysiological differences to
particular attributes of behavior. Experimental stratagems have been
diverse, although most studies can be considered as either (a) exhaustive
analysis of the synaptic mechanisms of discrete functional systems which
may form a basis for the analysis of more complex operations mediated by
these systems; or (b) determination of the behavioral correlates of
particular intrinsic wave processes, e.g., alpha rhythm or hippocampal
theta rhythm; or (c) assessments of the induced electrographic states
produced by special behavioral manipulations, e.g., evoked activity and
reaction time. (pp. 69-70)

EEG has several attributes which result in its popularity for use in situations where monitoring is
required: it occurs continuously and spontaneously, and it is readily available via relatively inexpensive
and simple instrumentation. Spontaneous, ongoing EEG can be automatically recorded and analyzed with
relative ease, and has served as an indicator of many disparate varieties of states ranging from the
clinical (e.g., death, schizophrenia, anxiety) to performance (e.g., arousal, perception, processing
efficiency). As Mirsky (1969) has pointed out:

. . . large masses of data may be processed quickly; the data so obtained
may be more easily subjected to statistical evaluation. Methods range
from total integrated energy (more properly, voltage) through base-line
crossing, frequency analysis, and analysis of slope changes into
amplitude measures. (p. 325)

Evoked electrical potentials differ from ongoing EEG by occurring in close temporal proximity to the
stimuli by which they are elicited, by their relative consistency of shape, and (usually) by much smaller
amplitudes than background brain wave activity.
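
The contrast just drawn explains why evoked potentials are usually extracted by averaging: activity
time-locked to the stimulus survives the average, while unrelated background activity tends toward zero.
The following Python sketch illustrates the idea on synthetic data only. The waveform shape, amplitudes,
trial count, and function names are illustrative assumptions and are not taken from any study cited here.

```python
import math
import random


def average_epochs(epochs):
    """Average equal-length, stimulus-locked epochs sample by sample."""
    n_trials = len(epochs)
    return [sum(samples) / n_trials for samples in zip(*epochs)]


def simulate_epoch(rng, n_samples=100, peak_index=30, erp_amp=5.0, noise_sd=10.0):
    """One synthetic trial: a small Gaussian-shaped deflection plus larger noise.

    All parameter values are hypothetical and in arbitrary units.
    """
    epoch = []
    for i in range(n_samples):
        evoked = erp_amp * math.exp(-((i - peak_index) ** 2) / 50.0)
        background = rng.gauss(0.0, noise_sd)  # stands in for ongoing EEG
        epoch.append(evoked + background)
    return epoch


rng = random.Random(0)  # fixed seed so the sketch is repeatable
epochs = [simulate_epoch(rng) for _ in range(400)]
erp = average_epochs(epochs)
peak_sample = max(range(len(erp)), key=lambda i: erp[i])
print("peak of averaged waveform at sample", peak_sample)
```

In any single simulated trial the deflection is smaller than the noise; after averaging 400 trials the
residual noise is reduced by roughly the square root of the trial count, and the peak becomes visible.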

Sem-Jacobsen (1971) has monitored EEG (among other physiological indices) from pilots under stress
in real-life missions, and has shown that previous quality of pilot performance can be used to predict
occurrence of delta-theta (2-8 Hz) activity under stress. Sem-Jacobsen attributes many aircraft
accidents to pilot task overload which elicits a freezing under conditions of extreme stress, reflected
in a slowing and eventual flattening of brain wave activity.

A  POSSIBLE  SYNTHESIS 

A  recent  paper  by  Cooper  et  al  (1977)  presents  an  interesting  demonstration  of  cortical  potentials, 
large  enough  to  stand  out  from  background  EEG  and  be  recognized  easily  without  averaging,  which  occur  as 
a  result  of  changes  in  a  visual  display  consisting  of  a  video  recording  of  a  model  landscape,  comprising 
an  aerial  view  of  countryside  with  residential  buildings,  fields,  hedgerows  and  roads  in  the  foreground, 
and  with  mountains  and  sky  in  the  distant  background.  Various  vehicles  (vans,  trucks,  or  cars)  were 
introduced and moved through the scene at various infrequent, unpredictable times. The task of the
subjects was to (a) detect and then (b) identify by type each vehicle which entered the display screen.
Typically,  about  a  second  before  the  detection  response  was  made,  S's  eye  movements  indicated  a  sudden 
shift  to  the  interesting  for  the  present  consideration)  "a  large,  discrete,  well-defined,  positive-going 
potential  (occurred)  in  the  EEG."  This  brain  the  range  of  this  latency  varied  from  0.1  to  2.0  seconds, 
which  rules  out  a  simple  analogy  to  the  P300  paradigm  with  detection  the  eliciting  stimulus.  There  was 
no  such  wave,  incidentally,  associated  with  the  other  half  of  S's  task,  the  identification  of  the  type  of 
vehicle. 

As the authors point out,

... it appears that the positivity occurs when the observer sees one
of the class of events that he has been told to detect.

Both the role of these potentials, their distribution, and their
positive polarity suggest that they might have common origins with
the P300 component of the cortical evoked potential which occurs
characteristically during discrimination and decision-making tasks.

A  less  well-defined  brain  event  observed  in  this  experiment  was 

a slow increase of negativity that started before the detection
positivity . . . (which) can start to increase while the eyes
are scanning other parts of the display, and it decreases after
completion of the motor tasks indicating detection and recognition.

Although the possible relationship of this observation to the CNV previously described seems a tempting
area for speculation, the authors refrain from its pursuit. It is, however, a most appropriate area for
research, and an elucidation of the relationship of these phenomena with P300 and CNV might open the door
to the application of the rather considerable body of knowledge which has already been gathered about
these brain wave events in laboratory settings to situations much more veridical than the typical laboratory
problem now utilized.

THE  WORKLOAD  PARADIGM 

Perhaps more progress has been made toward the utilization of brain wave information for the enhancement
of pilot performance in the area of monitoring and assessment of workload than in any other area.

This concept has been pursued effectively by Donchin and his colleagues (e.g., Wickens, Isreal, and Donchin 1977),
in the context of a large-scale effort which is centered in Donchin's Cognitive Psychophysiology
Laboratory at the University of Illinois and is aimed at the development of very closely coupled
man-computer systems. Wickens (1978) provides a brief but thorough and clear description of the purposes of
the workload measure and several hints as to the potential operational significance of information derived
therefrom:

In the on-line, adaptive context, it may be asserted that an intelligent
computer-based adaptive system, in order to optimally deploy the resources
of human and computer, should be provided with a real-time, updatable
estimate or model of the state (availability and allocation) of the operator's
attentional resources, so that adaptive procedures may be initiated. The
characteristics of this state estimate and their potential use to the adaptive
decision maker are as follows.


(1) The available resources must be sufficient to meet the demands
imposed by all tasks which challenge the operator at any time: the
characteristic of task workload or reserve capacity. If the momentary
workload demands become sufficiently great, adaptive aiding procedures
can be implemented to temporarily unburden the human operator. These
may take the form of initiating alternate strategies of computer
processing or information display, implementing automatic control systems,
or of calling for extra manual assistance where such is available.

(2) Even if the resources are adequate, the attention must be
allocated properly to the critical tasks, displays or sources of
information, so that important sources are not ignored: the
characteristic of attention allocation. The distinction between
workload and allocation is crucial. It is self-evident that adequate
capacity inadequately deployed may lead to non-optimal performance.

Thus if resources are allocated incorrectly along channels of incoming
information or, if critical sources of information are being ignored,
warning or cueing signals might be provided to redirect the proper
distribution of attention as determined by some preset establishment
of priorities. Alternatively, system characteristics might be altered
with derived knowledge of the distribution of operator attention. For
example, computer resources might be allocated to monitor for, or
process signals along, information channels that are inferred to be
non-attended.

The basic research paradigm which is at present envisioned for the evaluation of workload measures
and their incorporation into a computer-aided aircraft control system involves an operator working on one
or more tasks which can be varied in complexity and difficulty. A flash of light, or similarly transient
auditory stimulus, is introduced; this stimulus must be counted or otherwise used in the performance of
a secondary (or tertiary) task, and the latencies and amplitudes of the P300's elicited by these so-called
probes (i.e., tests of attention, reserve processing capacity, etc.) are used as indicators of operator
workload; the controlling computer then allocates task responsibility between man and machine primarily
on the basis of this information, thereby optimizing overall system performance. Too little work for the
operator will result in boredom and performance deterioration, so it may be wise to require of the man
some work which theoretically is most efficiently and competently undertaken by machine. Too much work
for the operator, particularly under especially stressful conditions (e.g., landing in bad weather, combat,
maneuver in crowded airspace) may result in dangerous overload and, again, deterioration of performance;
so too there are times when a function which is theoretically best handled by a human should (or even
must) be assigned to the machine. The overall guiding principle is, obviously, to use knowledge of the
operator's reserve processing capacity and level of performance in the light of current task demands.
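
The allocation logic of this paradigm can be sketched as a simple threshold rule. The Python fragment
below is only an illustration: it assumes that a smaller probe-evoked P300 amplitude indicates less
reserve capacity, and the function name, thresholds, and units are hypothetical rather than values
proposed by the investigators cited.

```python
def allocate(probe_p300_amplitude, low_reserve=4.0, high_reserve=10.0):
    """Choose an allocation action from a probe-evoked P300 amplitude.

    Assumption for this sketch: smaller probe amplitude means less reserve
    capacity. Thresholds are hypothetical, in arbitrary microvolt-like units.
    """
    if low_reserve >= high_reserve:
        raise ValueError("low_reserve must be below high_reserve")
    if probe_p300_amplitude < low_reserve:
        # Little reserve capacity: risk of overload, so unburden the operator.
        return "shift work to machine"
    if probe_p300_amplitude > high_reserve:
        # Ample reserve capacity: risk of boredom, so give the operator more to do.
        return "shift work to operator"
    # Between the thresholds nothing changes, which avoids rapid switching.
    return "keep current allocation"


for amplitude in (2.0, 7.0, 12.0):
    print(amplitude, "->", allocate(amplitude))
```

The band between the two thresholds is a design choice of the sketch: it keeps small fluctuations in
the estimate from causing responsibility to pass back and forth between man and machine.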

Wickens (1978) asks,

... what represents an appropriate control, non-adaptive system against
which to compare adaptive system performance? Should this control be one
that functions with maximum aiding or minimum, or an intermediate level?
Should it perform under best-case or worst-case environments? The latter
refer to those conditions that are not typical of real world operations,
occurring only very infrequently, but with potentially disastrous
consequences. What performance index should be used to evaluate and
compare such systems?

As a specific hypothetical example, consider the following: An
adaptive multi-loop control system is developed which will shift an
axis to automatic autopilot control, when concurrent task workload is
inferred to exceed some criterion. A tracking performance index of
this system can clearly be measured. However, against what should
it be compared? The performance of a maximum-aiding (all autopilot)
system, to which it will almost inevitably be inferior, or to that of
a minimum aiding (all manual) system to which it will probably be
superior? To complicate matters, this superiority relationship might
be reversed by introducing "worst case" environments: e.g., levels
of disturbance which the autopilot is incapable of handling, or
autopilot system failures which, it has been argued, are less easily detected
under autopilot than in-the-loop situations (Young 1969). Under these
circumstances, what arbitrary costs should be assigned to latencies in
failure detection, as opposed to errors in tracking performance, when
deriving an ultimate performance index? These again, are questions whose
answers may lie beyond the scope of the conference, but which inevitably
arise when adaptive uses of workload measures are considered.

Some of the experiments which have provided the basis for the development of this paradigm, and which
for the most part emanate from the Cognitive Psychophysiology Laboratory at the University of Illinois,
will now be briefly described. As Donchin (1976) has stated in general,

specific components of the ERP's have been shown to be manifestations
of such cognitive events as the preparation to respond, the preparation
to intake and process information, the registration of a surprising
event, or the processing of task-relevant information. Our studies
during the past two years have indicated that the ERP is a particularly
valuable source of information about the operator, information which
either is not otherwise available, or which can only be obtained
through potentially disruptive direct questioning. Briefly stated, the
ERP appears to be a sensitive index of the strategies adopted by
individuals to cope with their assigned tasks, the manner in which they
allocate their resources and distribute their information processing
capacities.


EXPECTANCIES 


A good example of a large group of experiments on the way in which S's mental set or expectations
match the information conveyed by the stimulus is Squires et al (1976), which demonstrates changes in
P300 amplitude and conformation "as a function of the prior probability of the stimulus and the
specific sequence of the preceding stimuli." In other words, for any given level of frequency of an
event (the intrinsic probability of the event) the surprisingness of any single occurrence of it is
significantly affected by the value of its immediate predecessors. Under conditions of relatively low
workload a previous sequence of stimuli up to about five is taken into account; when the workload is
high, the number of previous stimuli which are apparently taken into account by S decreases. This slope,
which can be reliably shown to be a function of task demands, is interpreted as an index of the reserve
processing capacity of S's short-term memory buffer, and therefore is considered a useful means for the
assessment of workload (Duncan-Johnson and Donchin 1977). It is interesting to note that this effect has
been demonstrated experimentally to hold over a range of intrinsic probabilities of .10 to .90
(Duncan-Johnson and Donchin 1977), that the interval between probes is not of critical importance (Donchin,
McCarthy and Kutas 1977), and that there are some modality differences (auditory vs visual) in (1) the
way workload affects changes in P300 amplitude and (2) the operation of the sequential effect model
described above. Specifically, non-target visual stimuli (i.e., visual stimuli to which S need not respond)
do not, apparently, follow the model for P300 amplitude (Squires et al 1977). Incidentally, consideration
of the use of the sequence effect described above should be tempered by the realization that this effect
disappears when S knows the intrinsic probabilities in the situation. Perhaps an implication of this,
and it might be a useful one, would be that in a situation of known probability parameters, a rare and
important stimulus which occurred several times in a row would not rapidly become habituated. The
boundaries of this situation, i.e., the length of the epoch (in which the postulated rare stimulus occurred
frequently) which would be necessary before S decided that this event had now become frequent, at least
temporarily (as opposed to perceiving simply an unusual frequency of a truly rare stimulus), might be
interesting as an experiment. How rapidly does a rare or unusual stimulus habituate and what factors
affect the rate?
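
The dependence of surprise on the immediately preceding stimuli can be illustrated with a toy
calculation. The Python sketch below is not the sequential model of Squires et al; it is a deliberately
simple stand-in in which the expectancy for a stimulus is the fraction of matching stimuli among the
last few predecessors, with the window length (`depth`, a hypothetical parameter) standing for the
number of prior stimuli S takes into account.

```python
def surprise(sequence, depth):
    """Surprise of the final stimulus given up to `depth` immediate predecessors.

    Returns 1 minus the fraction of those predecessors that match the final
    stimulus. A toy measure assumed for illustration only.
    """
    *history, current = sequence
    window = history[-depth:] if depth > 0 else []
    if not window:
        return 0.5  # no history considered: both outcomes treated as equally expected
    matches = sum(1 for stimulus in window if stimulus == current)
    return 1.0 - matches / len(window)


# Same final stimulus "B", different preceding sequences.
print(surprise("AAAAAB", depth=5))  # no recent B's in the window
print(surprise("BBAAAB", depth=5))  # two B's within the last five stimuli
print(surprise("BBAAAB", depth=1))  # shallow window, as assumed under high workload
```

With a window of five the two sequences yield different surprise values for the same final stimulus;
with a window of one the earlier B's no longer matter, which is the kind of loss of sequential
dependence the text associates with high workload.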

In general, the P300 relates to a stimulus through changes in latency, which reflects the speed with
which the stimulus is recognized, and amplitude, which reflects the informational value and task relevance
of the stimulus. Informational value, or feedback, is usually operationally represented in these
experiments by stimulus rarity; task relevance, through instructional set. In general, greater task
relevance increases amplitude and greater attentional demands (e.g., greater display complexity) increase
latency. The stimulus must have at least some task relevance in order to elicit a P300 at all. If it is
an unusual or rare (low probability) stimulus, and therefore a surprising stimulus (the degree of which
may be affected by preceding stimulus patterns, knowledge of intrinsic probabilities, and, no doubt,
other circumstances), it will elicit a relatively large P300.

intelligence,  motivation,  e.g.  -  any  or  all)  which  are  to  be  allocated  among  various  demands  at  any  given 
time.  A  computer-based  performance  enhancing  system  should  monitor  the  current  resource  allocation,  the 
additional  leftover  utilizable  capacity,  and  be  able  to  program  optimal  sharing  of  task  responsibility 
between  itself  and  the  human  operator.  Usually  the  primary  task  involves  tracking,  and  its  difficulty 
can  be  readily  manipulated  (e.g.,  by  varying  the  number  of  dimensions);  the  secondary  task  involves, 
usually,  counting  the  less  frequent  member  of  a  pair  of  tones  which  vary  in  pitch. 

Experiments utilizing this paradigm "indicate that while first order ERP's are relatively insensitive
to momentary fluctuations in tracking difficulty, they clearly discriminate between low levels of tracking
demand (no tracking vs. one dimensional tracking). Higher levels of demand (one vs. two dimensions) are
differentiated by the extent of sequential processing of the stimulus series, a measure that is similarly
revealed in the ERP's" (Donchin 1976). Taken together, these data provide the basis for a tentative
measure of workload (and, more importantly, the obverse: reserve resource capacity) based upon amplitude
of P300. This model has been useful, for example, in experiments (1) showing that a secondary task
requiring an occasional button-press interferes with performance on a (difficult) tracking task and (2)
studying the effect when a third task is added to ongoing primary and secondary tasks. In the former
case, if whatever action is taken as a result of the button-push is taken instead when P300 occurs,
hypothesizes Donchin (1976), performance on the tracking task would show less interference effect. In
the latter case, the slope of the sequential effect (i.e., the number of previous stimuli which affect
P300 amplitude to the auditory probes comprising the secondary task) determines the likelihood that the
third task can be introduced without deterioration of performance on tasks I and II. It should be noted
that performance on tasks I and II is not enhanced in this model (though it is likely that information
derived therefrom could be used to optimize all-task performance by changing parameters when P300
indicates low processing capacity); it is the overall system effectiveness which benefits. In other words,
the third task could be introduced only when the operator is cognitively ready and able to handle it.

Another experimental paradigm of interest is the comparison of (1) the relationship of P300 latency
to reaction time in speed-demand situations with (2) this relationship under an accuracy-demand
instructional set. Under speed conditions response generally precedes P300; under accuracy conditions, reaction
follows P300. Experimental results (Kutas, McCarthy and Donchin 1977) show that there is a high
probability that if an error has been made under speed conditions, reaction time is less than P300 latency in a
paradigm where S is required to count the rare stimuli, and that making use of this knowledge can enhance
performance under the speed condition to the level attained under the accuracy condition, at no sacrifice in
speed.
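
One way such knowledge might be used is to flag responses that preceded the P300 for review or
confirmation. The short Python sketch below assumes that a single-trial P300 latency estimate is
available for each response, which is itself a substantial assumption; the function name and the
sample values are hypothetical.

```python
def flag_probable_errors(trials):
    """Return indices of trials in which the response preceded the P300.

    Each trial is a (reaction_time_ms, p300_latency_ms) pair. Flagging is
    based only on the ordering of the two times, as a rough screen.
    """
    return [
        index
        for index, (reaction_time_ms, p300_latency_ms) in enumerate(trials)
        if reaction_time_ms < p300_latency_ms
    ]


# Hypothetical trials: (reaction time, P300 latency), both in milliseconds.
trials = [(280, 350), (420, 360), (300, 300)]
flagged = flag_probable_errors(trials)
print("trials to re-check:", flagged)
```

A flagged trial is not necessarily an error; the screen only identifies responses made before stimulus
evaluation, as indexed by P300, appears to have been completed.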


There are other experimental methodological ingenuities which appear interestingly relevant. The
Cooper et al study which seems to elicit brain events which look like P300 in quasi-veridical settings
has already been discussed, and would appear to offer the means to take into account simultaneously
information to be gained from on-going monitoring of background EEG with naturally-occurring ERP's.

Another idea which may be promising is Regan's (1977) work on steady-state ERP's. Here a transient change
of  intensity  or  some  other  important  parameter  of  a  sensory  stimulus  is  repeated  to  elicit  a  series  of 
ERP's  which,  through  Fourier  transform  analysis,  can  be  used  to  monitor  cognitive  response  to  the  stimulus. 
Finally,  the  finding  by  Kutas  (dissertation  in  progress)  that  hemispheric  CNV  amplitude  indicates  intended 
choice  of  hand  with  which  a  response  is  to  be  made  (in  a  left-right  discrimination  task)  might  be 
eventually  put  to  good  operational  purpose. 
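
The Fourier analysis mentioned in connection with steady-state ERP's can be sketched as the
measurement of response amplitude at the stimulus repetition frequency. The Python fragment below
computes a single Fourier component directly; the sampling rate, frequencies, and synthetic signal are
assumptions chosen for illustration and do not represent Regan's procedure.

```python
import cmath
import math


def amplitude_at(signal, sample_rate_hz, freq_hz):
    """Amplitude of the sinusoidal component of `signal` at `freq_hz`.

    Computes one Fourier component directly. The result equals the sinusoid's
    amplitude when the record holds a whole number of cycles of `freq_hz`.
    """
    n = len(signal)
    component = sum(
        value * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, value in enumerate(signal)
    )
    return 2.0 * abs(component) / n


# Synthetic one-second record: a 10 Hz response of amplitude 3 (arbitrary units),
# sampled at 200 Hz. All values are hypothetical.
sample_rate_hz = 200
signal = [3.0 * math.sin(2 * math.pi * 10 * k / sample_rate_hz) for k in range(200)]

print("amplitude at 10 Hz:", round(amplitude_at(signal, sample_rate_hz, 10), 3))
print("amplitude at 25 Hz:", round(amplitude_at(signal, sample_rate_hz, 25), 3))
```

Tracking this one amplitude over successive records is a much lighter computation than a full
spectrum, which may matter if the measure is to be followed continuously.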

PROBLEMS  AND  APPLICATIONS 


It may be useful at this point to group the sorts of pilot problems which may lend themselves to
brain wave enhancement, and to speculate briefly as to some of the ways in which the experimental findings
described above may, eventually, apply.

WORKLOAD  ALLOCATION 


Workload allocation reflecting accurate continuous assessment of operator state and processing
capacity probably would function for pilot performance enhancement most frequently in normal flight, to
provide the earlier mentioned theoretically optimum mix between human and machine control of aircraft
function. Donchin (1976) describes an intended experiment which will test the workload allocation model
described previously and which could serve as a prototype for an operational aircraft operation:

The function of an autopilot is to replace the human as an active element
in the flight control loop. The reduction in pilot work-load gained
when the aircraft functions in an autopilot mode is however purchased at
a cost. The cost is represented by the increasing likelihood of a
malfunction in the control system with the greater number of mechanical
elements in the loop, the loss of flexibility of the system in responding
to unexpected environmental inputs, and the decrease in speed and accuracy
with which the pilot can resume command if such events occur (e.g., the
appearance of an input requiring an evasive maneuver or implementation of
some other control strategy).

Because  there  is  a  cost  associated  with  autopilot  control  based  upon 
actuarial  data  of  component  reliability  and  on  the  likelihood  of  external 
event occurrence, such control should not be in effect at all times, but
only  when  the  external  (non-control)  demands  of  the  operator  reach  a 
sufficiently  high  level  that  the  cost  of  remaining  in  the  loop,  as 
determined  by  excessive  work-load,  exceeds  that  of  autopilot  function. 

This  experiment  will  explore  the  implementation  of  an  ERP-based  adaptive 
algorithm  that  will  shift  control  from  human  to  autopilot,  according 
to  inferences  based  upon  total  processing  demand  and  the  distribution  of 
processing  resources. 

Subjects  will  fly  a  predetermined  simulator  flight  path,  perturbed  by 
an intermediate level of turbulence. External demands will be imposed
in the form of a discrete decision making task, simulating target
identification  and  classification.  Subjects  will  be  instructed  to 
monitor  a  display  for  discrete  items  of  alphanumeric  information  and 
identify  and  classify  this  information  into  one  of  four  categories 
via  a  manual  response. 

The subject's cognitive state will be monitored by three kinds of ERP
inducing stimuli. The requirement to process a Bernoulli series will
generate  a  measure  of  overall  work-load.  ERPs  elicited  by  the 
appearance  of  the  peripheral  targets  will  be  analyzed  to  infer  the 
degree  of  processing  of  these  targets.  Finally,  probe  stimuli  delivered 
along  the  channels  conveying  flight  path  information  will  elicit  ERPs 
that  will  be  employed  to  make  inferences  concerning  processing  or 
neglect  of  the  axes  of  tracking  control. 

An adaptive algorithm will monitor the ERPs and make the decision to
implement autopilot aiding according to the following decision rule:
If work-load is inferred to be high and tracking axes neglected, the
autopilot will be implemented. Otherwise the pilot will remain in the
control loop. Once the pilot is out of the loop, the decision to
de-activate the autopilot will be made if the level of work-load and
peripheral target frequency both drop below predetermined criteria.

Performance of the adaptive system will be based upon a joint measure
of target identification performance (speed and accuracy), and tracking
error, integrated over both the manually controlled and autopilot
deviations. From this index will be subtracted a fixed cost per minute
of the time spent in the autopilot mode. This performance index will
be compared with that achieved in a regular non-adaptive session of the
same task in which naturally the autopilot cost term will be zero.
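
The decision rule and performance index described in the quoted passage can be sketched roughly as follows. This is only an illustrative Python sketch: the names (`Inference`, `Criteria`, `autopilot_engaged`, `performance_index`), the threshold fields, and the linear way the joint measure combines target score and tracking error are assumptions made for illustration and are not taken from Donchin (1976).

```python
from dataclasses import dataclass


@dataclass
class Inference:
    """ERP-based inferences for one decision epoch (hypothetical structure)."""
    workload: float            # inferred overall workload, arbitrary units
    tracking_neglected: bool   # True if probe ERPs suggest neglect of the tracking axes
    target_rate: float         # peripheral targets per minute


@dataclass
class Criteria:
    """Predetermined criteria; field names and values are placeholders."""
    workload_high: float
    workload_release: float
    target_rate_release: float


def autopilot_engaged(engaged: bool, inf: Inference, crit: Criteria) -> bool:
    """Return the autopilot state for the next epoch."""
    if not engaged:
        # Engage only if workload is high AND the tracking axes are neglected.
        return inf.workload >= crit.workload_high and inf.tracking_neglected
    # Once engaged, release only if workload AND target frequency are both low.
    release = (inf.workload < crit.workload_release
               and inf.target_rate < crit.target_rate_release)
    return not release


def performance_index(target_score: float, integrated_tracking_error: float,
                      autopilot_minutes: float, cost_per_minute: float,
                      error_weight: float = 1.0) -> float:
    """Joint measure minus a fixed cost per minute of autopilot time.

    The linear combination used for the joint measure is an assumption.
    """
    joint = target_score - error_weight * integrated_tracking_error
    return joint - cost_per_minute * autopilot_minutes
```

In a non-adaptive session `autopilot_minutes` is zero, so the cost term vanishes and the index reduces to the joint measure alone.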

Computer-controlled workload allocation could also function in an important beneficial manner under
acutely demanding and stressful conditions, applying the same basic algorithm in a slightly different way.

WARNINGS


The  use  of  brain  waves  for  the  automated  enhancement  of  warning  effectiveness  could  occur  in  two 
ways.  A  computer  could  sense  some  deficit  in  an  operator's  state  of  being,  or  potential  deficit 
(anticipating  a  crisis),  or  the  computer  could  observe,  perhaps,  a  lack  of  attention  to  a  warning  display 
or  other  performance  (as  opposed  to  state)  deficit,  and  take  action  to  further  stimulate  the  human 
operator. In the former general circumstance the system monitor would make inferences about operator
state, probably from a set of physiological information channels; in the latter, it would make inferences
about observed deficits in operator performance, probably from assessment of ERP's or their lack, in
response to warning signal displays treated functionally for this purpose as probes. The potential for
use of brain wave indicators of dangerous operator state or behavior is doubtless apparent: presence
of theta can predict drowsiness and deterioration of performance, and of course the sleeping state can
be readily detected. The detection of undesirable levels of arousal (inappropriately high or low) or
emotional states can probably be enhanced through other physiological or behavioral channels. Attention
to a display could be assessed via Regan's (1977) steady-state ERP technique mentioned earlier. Donchin
(1976) holds out the promise of use of ERP's to distinguish non-response to a warning resulting from a
purposeful decision to ignore it, from accidental non-recognition; this way the computer-based system can
refrain from repetition or intensification of information which the operator has already processed and to
which he presumably will respond.

SENSING  EVENTS  AND  SAVING  TIME 


The computer could also certainly determine the occurrence of an event like target acquisition (e.g.,
the Cooper et al. paradigm) from oculography, pupillometry, and brain waves more rapidly and less
disruptively than this information can be made known by a human observer employing a gross skeletal
response such as pushing a button. In some cases it may even provide a more accurate judgement of the
task-relevant event than could the button push (i.e., the P300 latency - reaction time relationship under
speed conditions described earlier). In any event, there are circumstances wherein a few milliseconds
could  provide  an  important  advantage.  Use  of  information  about  the  laterality  of  the  readiness  potential 
to  (a)  infer  the  imminence  of  a  motor  response,  (b)  the  hand  with  which  it  will  be  made,  and  (c)  to 
execute  the  implied  command  (e.g.,  fire  a  weapon,  change  course  and/or  speed,  transmit  a  message)  or 
(d)  to  bring  a  control  device  to  the  desired  hand  (to  avoid  reaching),  might  significantly  affect 
performance  efficiency  -  especially  if  the  small  single  savings  in  time  and  effort  were  to  be  accumulated 
over  rapidly  successive  events  in  a  continuing,  recycling  context  of  swift  decision  making  and  response. 

FARTHER  ALONG 


At some point in the development of more highly interactive, closely-coupled man-machine interfaces,
a serious effort should be made to develop the capacity for real time thought commands. In this mode
the computer would sense specific wishes and needs (and evaluations of the adequacy of its own
moment-to-moment performance in meeting these needs) on the part of the operator. Ultimately the ability to infer,
accurately,  sequential  chunks  of  complex  information  would  be  needed,  utilising  electrical  representations 
of  verbal  or  non-verbal  cognitive  activity.  Chapman's  efforts  (1977)  to  locate  ERP's  related  to  specific 
words  in  multi-dimensional  semantic  space  via  the  semantic  differential  technique  may  be  a  promising 
step  toward  this  end. 

A  more  proximal  goal  would  be  the  development  of  machine  ability  to  sense  such  general  intangibles 
as  operator  uncertainty  (and  therefore  the  need  for  more  information,  or  at  least  a  need  to  maintain 
decision  options),  and  approval  or  disapproval.  Sensing  approval  or  disapproval  (for  want  of  a  better 
descriptor)  would  provide  instantaneous  qualitative  feedback  to  the  machine  somewhat  in  the  way  "warm- 
cold"  feedback  is  provided  to  the  blind  searcher  in  children's  games  -  or,  perhaps  more  appropriately, 
in  the  way  varying  intensities  of  temperature  guide  a  missile  toward  a  heat  source.  Ability  to  assess 
these  variables  continuously  and  sensitively  could  provide  the  basis  for  very  fine  control  of  machine 
by man, perhaps even allowing the creation of an artificially intelligent servomechanism so closely
responsive in real time to its operator's cognitions that it could serve virtually as a functional
extension of the operator's own nervous system.

The  ultimate  aim  for  this  type  of  man-machine  system  in  general,  and  a  goal  at  least  as  well  suited 
to  enhancement  of  aircraft  pilot  performance  as  to  any  other  military  application,  is  the  utilization 
of  the  human  operator  for  those  purposes  for  which  he  is  uniquely  qualified:  as  a  complex  pattern 
recognizer  and  decision  maker  -  the  pure  strategist  or  tactician.  The  computer  would,  in  real  time  and 
functionally  as  if  an  organic  part  of  the  operator,  undertake  such  activities  as  storing,  organizing, 
and  retrieving  data  base  information  as  it  is  acquired  or  needed,  performing  other  data  handling  functions, 
and  carrying  out  decisions  once  it  can  accurately  determine  that  they  have  been  made. 

CURRENT  PROBLEMS  AND  NEEDED  RESEARCH 


Aside from the obvious, which is to increase the certainty with which inferences from P300 can be
made  and  to  refine  the  methodology  of  making  use  of  them,  it  is  possible  to  describe  several  areas  where 
current  research  techniques  need  to  be  made  more  powerful,  and  new  methods  which  have  been  identified 
and  require  further  development. 

Virtually  total  reliance  on  P300  for  access  to  cognitive  events  is  too  limiting,  and  there  are 
several ways in which more brain information might be made available. One line of attack into this
problem area would be to seek to understand the events underlying other components. Vidal (1977), for
example,  has  made  potentially  interesting  use  of  some  of  the  early  exogenous  components  to  guide  a  cursor. 
Also,  there  is  some  indication  that  P300  bandwidth  might  be  increased  by  independent  probing  of  isolated 
sectors  of  each  retinal  field.  Further,  Donchin  (1976)  has  suggested  some  conditions  under  which  N100 
(an  exogenous  component)  and  N200  might  yield  operationally  useful  information  on  attentional,  perceptual, 
and processing events. Another line of investigation would be to open up new sources of information.
For example, using multiple arrays of electrodes, latency and amplitude differences arising from different
sites might reflect distinct cognitive activities. Further, regional variations in latency and amplitude
of the same component might be related in stable ways to various cognitive activities. Development of
functions representing, say, ratios of P300 amplitudes at various locations combined with concurrent sets
of latency differences, or other secondary treatments of multiple recordings of the same event, might
yield fine discriminations among processing stages or other relevant aspects. Also, the development of
magnetoencephalography would allow access to subsurface activity (as well as providing physically
non-impinging sources).

Even better and more reliable single-trial identification of brain wave events of interest is needed.
While the development by Donchin and his colleagues (e.g., Squires & Donchin, 1976) of such a capability,
using a sliding template and stepwise discriminant analysis, has made feasible the real time use of ERP's
for vehicle control, much more needs to be known about such issues as the stability of individual templates
for recognition of ERP's, optimum strategies for updating these templates, and, of course, increasing the
accuracy with which such recognitions are made. Present capabilities seem remarkably good, but if
life-or-death actions are to be taken on the basis of them, either accuracy must be improved or fail-safe
procedures developed. For allocation of workload under routine or even somewhat demanding conditions
present levels of accuracy seem adequate.

Figure 1. Diagram of Teichner's Theoretical System (after Teichner, 1974).

Figure 2. Average rates of effective information reduction on Sternberg task

INTRODUCTION

The assessment of pilot workload is a special case of the measurement of information-processing load,
the aggregated demands placed upon an individual in the performance of a particular cognitive task or
function. Three general approaches have been employed in the measurement of information-processing load.

The first is that of subjective estimation. Subjective estimates are involved when workload is estimated
from the task engineer's opinion as to the probable magnitude of processing load, an opinion that may be
based on previous experience or an analytic theory. However, subjective estimates of workload by the user
or participant are the most common form of workload measurement in aircraft design. Both types of
subjective ratings have serious weaknesses.

The second major method of measuring processing load employs behavioral measurement. Here the notion
is that the information-processing capacity of an individual is limited so that the workload imposed by
one task can be estimated by the degree to which it interferes with the simultaneous execution of a
secondary measurement task, such as simple reaction time or manual tracking. This method has much to
recommend it over the subjective measurement techniques, particularly with respect to objectivity. But the
behavioral-interference method is difficult and time-consuming to implement, and yields relatively little
data for the amount of time and energy invested in testing. As a consequence, this method has been of
more theoretical than applied interest.

The  third  major  method  is  physiological,  in  which  the  response  of  the  nervous  system  to  the  load 
imposed  by  an  information-processing  task  is  assessed.  Momentary  increases  in  processing  load  induce 
short-latency,  short-lived  increases  in  measures  of  central  nervous  system  activation.  These  changes  are 
most  evident  and  most  easily  measured  in  the  autonomic  nervous  system.  Among  the  autonomic  measures  of 
activation, changes in pupillary diameter appear to be the most sensitive and accurate (Kahneman, Tursky,
Shapiro & Crider, 1969).

This paper discusses the use of pupillometric measures in the evaluation of pilot workload. I begin
by  describing  the  innervation  of  the  pupil  with  respect  to  its  connections  with  brainstem  activation 
systems.  Modern  methods  for  pupillometric  measurement  are  then  described.  Next,  a  series  of  experiments 
describing  pupillary  response  in  a  variety  of  information-processing  tasks  is  reviewed.  Finally  some 
possibilities  for  the  use  of  pupillometric  methods  in  the  measurement  of  pilot  workload  are  discussed. 

INNERVATION  OF  THE  PUPIL 

Pupillary diameter is determined by the relative state of contraction of the two opposing muscle
groups of the iris, the sphincter and the dilator pupillae. The dilator pupillae are radially oriented
bands of smooth muscle that are innervated by the sympathetic branch of the autonomic nervous system
through the cervical sympathetic ganglia. The sphincter pupillae are innervated by the parasympathetic
system through the ciliary ganglia, and act to close the pupil when activated. Pupillary dilation,
therefore, can result from either sympathetic activation or parasympathetic inhibition. Cortical inhibition
of the Edinger-Westphal nucleus, the brainstem nucleus that projects to the ciliary ganglia, has
been frequently hypothesized to accompany cortical activation. Both the sympathetic and parasympathetic
brainstem nuclei involved in the regulation of the iris musculature are intimately connected with the
reticular activating system. Indeed, pupillary measures were used to assess reticular formation functions
in the pioneering work of Moruzzi and Villablanca (Moruzzi, 1972).

METHODS FOR PUPILLOMETRIC MEASUREMENT

The  pupillometric  measurement  of  information-processing  load  requires  that  accurate  measures  of 
pupillary diameter be obtained during the course of an information-processing task. Early work in this
area employed photographic methods, in which infrared photographs of the pupil were obtained at the rate
of 1 or 2 per second for the duration of the processing task. Pupillary diameter was determined by
subsequent direct measurements taken from the enlarged image of each frame. Although accurate, the
photographic method suffers from two weaknesses: it is a laborious and frustrating procedure to implement
and, for this reason, it is not practical to obtain fine temporal resolution because of the resulting
proliferation of photographs.

The  second  principal  method  of  pupillometric  measurement  involves  the  use  of  a  high-resolution  infrared 
video  camera  and  a  special-purpose  image  processor  that  extracts  an  estimate  of  pupillary  diameter  from 
each  frame  of  the  video  image.  Originally  developed  under  a  grant  from  NIH,  this  instrument  is  presently 
manufactured  by  Gulf  and  Western  Applied  Science  Laboratories,  formerly  the  Whittaker  Corporation.  All 
major laboratories now involved in pupillometric research use this instrument.

In the basic video scan pupillometer the subject's head is restrained by a chin and forehead support.
An infrared video camera is placed outside the subject's foveal field of vision, as is an infrared
slit-lamp illuminator. Both the illuminator and the camera are focused on one of the subject's eyes and
the resulting image is sent to the image processor for extraction of pupillary diameter. This basic
configuration of head rest, illuminator and camera is adequate for most experimental work. Pupillary
measurements may be made over a wide range of lighting conditions, including complete darkness. The
subject is free to move his gaze over a limited portion of the visual field; strict fixation is not
required. Positioning of the subject in the head support is quickly accomplished. With appropriate
adjustments this testing arrangement results in little subject fatigue.

For purposes requiring greater freedom of head movement than allowed under this configuration, a
head-tracking pupillometer may be used. This permits recording of pupillary diameter from a seated
subject with complete freedom of head movement. In this device, two video cameras are employed. The
second camera is used to locate the head of the subject in three-dimensional space and by the use of
servo-mechanisms direct the primary camera to the subject's pupil in that space. Although rather
expensive, this head-tracking arrangement seems to perform quite reliably.

In the basic pupillometer, pupillary diameter is estimated from the video image of the eye by the
following method: Each raster line of the image is first scanned for sharp light/dark contrast points
that might signal the boundary between iris and pupil. The use of an infrared vidicon minimizes the
effects of iris coloration on the contrast of the iris-pupil boundary. A single control is provided for
the adjustment of the sensitivity of the contrast detection circuitry. Sensitivity is individually
adjusted for each subject but, once adjusted, remains stable over long periods.

The second stage of image processing is the search for a semicircle of contrast points which
together define the leading edge of the pupil. The diameter of this semicircle provides a reliable
estimate of pupillary diameter. This measure is recomputed 30 times each second and is available for
computer input in either analog or digital form.
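
The two processing stages just described can be illustrated in software, although the instrument itself performs them in special-purpose hardware. The following Python sketch is a simplified assumption, not the instrument's actual algorithm: the function names, the synthetic test image, and the shortcut of taking the longest run of consecutive raster lines carrying a leading-edge contrast point (instead of a true semicircle search) are all hypothetical.

```python
def leading_edge_points(image, threshold):
    """Stage 1: on each raster line, find the first sharp light-to-dark step.

    `image` is a list of rows of brightness values; `threshold` plays the role
    of the contrast-sensitivity control (an assumed parameter).
    Returns a dict mapping row index -> column of the contrast point.
    """
    points = {}
    for row_index, line in enumerate(image):
        for col in range(1, len(line)):
            if line[col - 1] - line[col] >= threshold:
                points[row_index] = col
                break
    return points


def pupil_diameter_in_lines(points):
    """Stage 2 (simplified): longest run of consecutive raster lines that
    carry a leading-edge point, taken as the diameter in raster lines."""
    best = run = 0
    previous = None
    for row in sorted(points):
        run = run + 1 if previous is not None and row == previous + 1 else 1
        best = max(best, run)
        previous = row
    return best


def synthetic_eye(size, center, radius, bright=200, dark=20):
    """Bright field with a dark disk standing in for the pupil (test data only)."""
    return [[dark if (r - center[0]) ** 2 + (c - center[1]) ** 2 <= radius ** 2
             else bright
             for c in range(size)]
            for r in range(size)]


eye = synthetic_eye(size=41, center=(20, 20), radius=10)
edge = leading_edge_points(eye, threshold=100)
diameter = pupil_diameter_in_lines(edge)
```

Converting the result from raster lines to millimeters would require a calibration factor for the particular camera and optics, which is not given here.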

The performance of the image processor may be evaluated by means of a video display of the processed
image. Contrast points are indicated on the monitor as brightness-intensified sparkles. The extracted
image of the pupil is visually indicated by a darkening of all raster lines passing through the detected
pupil. Thus, if the pupillometer is functioning properly, the monitor displays a video image of an eye,
with intensified points along the left iris-pupil boundary and a dark band tangent to the upper and
lower boundaries of the pupil. Measurement quality can be assured by visual monitoring of the processor's
display.

Recording Constraints: Accurate recording is essential to pupillometric measurement of workload since
the  pupillary  dilations  reflecting  changes  in  central  activation,  although  highly  reliable  and  observable 
on  single  trials,  are  nonetheless  exceedingly  small.  For  this  reason,  other  factors  which  affect  pupillary 
diameter  must  be  carefully  controlled. 

Chief  among  the  non-cognitive  determinants  of  pupillary  diameter  is  the  well-known  light  reflex, 
which reduces pupillary diameter as integrated retinal illumination is increased. The light reflex is
very sensitive and the maximum amplitude of the response is several millimeters. For this reason the
luminance of the visual field must be constant during measurement. In our experiments on visual
information processing, we employed a computer-controlled CRT display in which task-relevant stimuli were
presented for short (100-200 msec) periods. At all other times equiluminance random dot fields were
displayed. Such control of the light reflex may not be possible if the subject is required to scan a
complex visual field of varying luminance.

The momentary state of the oculomotor reflexes mediating convergence and accommodation also must be
controlled as vergence movements and accommodation reflexively affect pupillary diameter. In our work
with visual displays, the critical visual stimulus was placed several meters from the subject to relax
accommodation and minimize convergence. At one time we were troubled with significant constrictions
occurring in some subjects while viewing prolonged visual displays. We attributed these artifacts to
uncontrolled vergence/accommodative movements and altered our task to utilize the more artifact-free
brief presentations. Nonetheless, visual stimuli can be employed in pupillometric research, but a great
deal of care must be taken in dealing with such materials.

These  problems  do  not  exist  when  auditory  displays  are  employed.  For  this  reason,  presentation  of 
information  in  the  auditory  mode  is  recommended  whenever  feasible. 

Recording Artifacts: The video-scan pupillometer is one of the most accurate, reliable, and trouble-free
psychophysiological recording devices ever developed. Nonetheless artifacts in the pupillometric record
do occur and must be dealt with before the data are analyzed.

The major sources of artifact are blinks and partial lid closures. In these cases, movements of
the eyelid obscure a portion of the pupil, resulting in erroneous measurement. Such artifacts are easily
observed in the pupillary record and are sufficiently obvious to permit automatic computer artifact
detection if desired. In our own work, the raw pupillary data from an entire experimental session is
stored on disk memory for later visual examination. Small artifacts are corrected by linear interpolation
and data segments with large artifacts are discarded. This editing procedure is rapid and assures accurate
pupillometric data.
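
The editing step described above (interpolate across small artifacts, discard segments with large ones) might be sketched as follows. This is a minimal illustration under stated assumptions: artifact samples are taken to be already flagged as `None`, and the function name and the `max_gap` parameter (the longest run of flagged samples still considered "small") are hypothetical, not part of the original procedure.

```python
def interpolate_small_artifacts(samples, max_gap):
    """Return a corrected copy of `samples`, or None if the segment is discarded.

    samples : list of pupil diameters, with None marking artifact samples
    max_gap : longest run of artifact samples that will be interpolated
    """
    corrected = list(samples)
    n = len(corrected)
    i = 0
    while i < n:
        if corrected[i] is not None:
            i += 1
            continue
        start = i
        while i < n and corrected[i] is None:
            i += 1
        gap = i - start
        # Discard the segment if the artifact is too long or touches either
        # end, since there is then no valid sample to interpolate from.
        if start == 0 or i == n or gap > max_gap:
            return None
        left, right = corrected[start - 1], corrected[i]
        step = (right - left) / (gap + 1)
        for k in range(gap):
            corrected[start + k] = left + step * (k + 1)
    return corrected


record = [4.0, None, None, 5.5, 5.0]
cleaned = interpolate_small_artifacts(record, max_gap=3)
rejected = interpolate_small_artifacts(record, max_gap=1)
```

With samples arriving 30 times each second, `max_gap` would be chosen in relation to the duration of a typical blink; the appropriate value is left open here.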

Another  major  source  of  artifact  lies  in  the  contrast  detection  threshold  established  for  certain 
subjects. If the illuminator is improperly focused, or if the subject has long drooping eyelashes, the
recognition  of  the  upper  pupil  boundary  may  be  uncertain.  This  results  in  a  characteristic  jitter  in  the 
pupillary  record.  When  this  occurs  the  source  of  the  difficulty  should  be  corrected.  Data  segments 
containing  such  jitter  should  be  discarded. 

PUPILLARY  CHANGES  IN  HUMAN  INFORMATION  PROCESSING 

There  is  a  large  body  of  experimental  evidence  that  suggests  that  pupillary  dilations  under  controlled 
conditions  reflect  with  high  accuracy  the  momentary  level  of  load  placed  upon  the  human  nervous  system  by 
information  processing  tasks  of  varying  difficulty,  content  and  complexity.  These  data  have  been  reviewed 
recently by Goldwater (1972) and Janisse (1974). In this section I shall present a series of experiments,
primarily from my own laboratory, that illustrate the sensitivity of pupillometric measures to momentary
changes in processing load which are in some degree relevant to the applied problem of pilot workload
evaluation.

Memory Load: One component of pilot workload is the demand placed upon short-term memory in verbal
communication with other aircraft or ground sites. Detailed verbal instructions, for example, must be
accurately retained. The limitations of short-term memory are well known to psychologists and human
factors engineers alike. Pupillometric measures provide a means of quantitatively assessing the
physiological load placed upon an individual by verbal information of varying amounts and complexity
which is to be retained for short periods of time.

Kahneman and Beatty (1966) presented the first pupillometric analysis of the processing demands
encountered in a short-term memory task. Figure 1 presents pupillometric records obtained during a
short-term memory task in which strings of 3 to 7 digits were auditorily presented at the rate of 1 per
sec. Two seconds after the last digit was heard, subjects were required to repeat the digit string at
the same rate. It is apparent from Figure 1 that the momentary degree of pupillary dilation accurately
reflects the cognitive workload imposed by the short-term memory task. Pupillary diameter increases in
a linear fashion with the presentation of each digit, reaching the maximum in the 2-sec pause preceding
report. As digits are unloaded from memory during report, pupillary diameter decreases with each digit
reported, reaching baseline levels after report of the final digit. In unpublished work, it was
determined that if the subject were requested to repeat the string a second time immediately after reporting
the final digit, the pupil immediately dilates to the peak diameter for that string and then decreases
with each digit spoken until the entire string has been reported for the second time. The magnitude of
the pupillary dilation at the pause between input and output in Figure 1 is an increasing function of
string length. Beatty and Kahneman (1966) demonstrated that a similar pupillary function is obtained
when a string of items is recalled from long-term memory for report: On request to report, a large
pupillary dilation is observed as information is retrieved from long-term memory (see Figure 2). As each
digit in the string is reported, pupillary diameter decreases, reaching baseline levels at report of the
last digit. Thus it appears that the limited capacity portion of the human information-processing system
may be loaded from either long term memory or environmental stimuli and that the pupillometrically
measured workload is similar in both of these cases.

Memory load is also determined by the difficulty of the to-be-remembered information. Remembering
unrelated nouns requires more capacity than remembering a string of single digits of equal length, as
measured by the difference in memory span for the two types of items. Figure 3 shows the pupillometric
data obtained for strings of four items of different types. The smallest dilations are observed for
strings of four digits that were to be simply repeated. Larger dilations were apparent for the string
of four words, indicating that both item difficulty and number of items determine workload in the memory
task. The largest dilations were obtained for the subjectively most difficult task of transforming each
of  the  four  digits  by  adding  one  before  report.  These  data  provide  strong  support  for  the  idea  that  task- 
induced  pupillary  dilations  provide  a  physiological  index  of  the  momentary  level  of  workload  imposed  by  a 
memory  task. 

This idea was subsequently confirmed in an experiment by Kahneman, Beatty, and Pollock (1967) in
which both pupillometric and behavioral interference methods were utilized to assess workload in the
four-digit add-one memory transformation task. Using a secondary task of visual target detection, it was
found that the behavioral estimate of workload and the pupillometric measure of physiological load were
in exact agreement. A series of controls ruled out any peripheral interference of the pupillary dilations
themselves on performance of the secondary task. In comparing the two data sets, the pupillometric data
was far more detailed than the behavioral data, required fewer trials to obtain, and was of considerably
lower variance.

Decision Processes: Even simple decision processes appear to impose some workload on the cognitive system
as indicated by pupillometric measures of activation. For example, Simpson and Hale (1969) measured
pupillary diameter in two groups of subjects who were required to move a lever to one of four positions.
In the decision group, subjects were told at the beginning of each trial that either of two directions
was permissible (e.g., front or left). Seven seconds later a response cue was presented and the subject
initiated one of the two movements. In the no-decision control group, subjects were instructed exactly
as to the desired movement on each trial (e.g., front). Pupillary dilation in the post-instruction
pre-response period was larger and more prolonged for those subjects who had to choose between two movements
before responding.

Substantially larger pupillary dilations are observed to accompany more difficult decision processes.
In an experiment reported by Kahneman and Beatty (1967), listeners were required to determine whether a
comparison tone was of higher or lower pitch than the standard. Clear pupillary dilation occurred in the
4-second decision period between presentation of the comparison tone and the response cue. The amplitude
of this dilation varied as a direct function of decision difficulty, which was set by the difference in
frequency between the standard (850 Hz) and comparison tones. This relation is shown in Figure 4, which
presents both the amplitude of dilation in the decision period and the percent decision errors as a
function of the frequency of the comparison tone. These dilations were highly reliable and did not
habituate over the experimental session. Pupillary dilations during decision appear to vary as a function
of cognitive workload, as inferred from task parameters and performance data.

Complex Reasoning: More complex cognitive functions not unexpectedly impose a major load upon the human
nervous system during their execution. This may be most easily observed in the laboratory using mental
arithmetic tasks. Such tasks may be regarded as directly analogous to other types of complex reasoning
tasks that may occur in man/machine interactions.

Pupillary dilations accompanying complex problem solving appear to be related directly to the
difficulty of such processing, although behavioral assessments of workload have not yet appeared for these
types of cognitive tasks. For example, Hess and Polt (1964) examined pupillary movements as multiplication
problems were solved mentally. Pupillary diameter increased during the period preceding solution, and
was related to presumed problem difficulty. Payne, Parry, and Harasymiw (1968) also report a monotonic
relation between mean pupillary diameter and problem difficulty, but note that this relationship is
markedly nonlinear with respect to difficulty scales based upon percent correct solution, time to
solution or subjective rating of difficulty. Pupillary diameter in mental multiplication appears to peak
rapidly as a function of difficulty, with more difficult problems requiring more time until solution is
reached. This suggests that cognitive capacity is quite fully taxed in complex mental arithmetic problems
so that the workload per unit time remains relatively constant as problem difficulty is increased over
moderate levels, but that the total time to solution is increased.

These investigations using the older photographic methods of pupillometric measurement were not able
to discern the fine temporal structure of complex reasoning tasks which is clearly evident when more
detailed video-scan pupillometry is employed. Ahern and Beatty (in preparation), as part of a study of
individual differences and cognitive load, presented subjects with multiplication problems at three levels
of difficulty. The problems were computer-controlled using acoustically-presented digitised speech
stimuli. These data are summarised in Figure 5. Clear dilations may be observed in all cases at the
presentation of the multiplicand (a single digit, a low two digit number or a high three digit number).
This dilation quickly subsides and the pupil returns towards basal levels until the multiplier is presented,
at which point a major dilation is observed. The duration of this dilation is related to problem difficulty,
being more prolonged for more difficult problems. These data suggest that pupillometric methods may serve
not only to measure the workload associated with a single task or function, but also to measure the time
course of that load with some degree of precision.
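
The kind of analysis described above, an averaged task-evoked response whose amplitude and duration are compared across difficulty levels, might be sketched as follows. This is only an illustrative sketch and not the procedure used in the studies cited: the function names, the baseline window, the sampling rate, and the threshold criterion for "duration" are all assumptions introduced here.

```python
from statistics import mean


def task_evoked_response(trials, baseline_samples):
    """Average baseline-corrected pupil diameter across trials, sample by sample.

    trials: list of equal-length lists of pupil diameters (mm), one list per trial.
    baseline_samples: number of leading samples used as the pre-task baseline
    (an assumed convention, not taken from the report).
    """
    corrected = []
    for trial in trials:
        baseline = mean(trial[:baseline_samples])
        corrected.append([d - baseline for d in trial])
    # zip(*corrected) groups the i-th sample of every trial together.
    return [mean(samples) for samples in zip(*corrected)]


def peak_dilation(response):
    """Largest dilation (mm above baseline) in the averaged response."""
    return max(response)


def dilation_duration(response, sample_rate_hz, threshold_mm):
    """Seconds during which the averaged response exceeds threshold_mm.

    The threshold is a hypothetical criterion chosen for illustration.
    """
    above = sum(1 for d in response if d > threshold_mm)
    return above / sample_rate_hz


if __name__ == "__main__":
    # Two made-up trials sampled at 2 Hz; the first two samples are baseline.
    trials = [
        [4.0, 4.0, 4.2, 4.4, 4.2, 4.0],
        [4.2, 4.2, 4.4, 4.8, 4.6, 4.2],
    ]
    response = task_evoked_response(trials, baseline_samples=2)
    print(peak_dilation(response))
    print(dilation_duration(response, sample_rate_hz=2.0, threshold_mm=0.1))
```

Under these assumptions, peak amplitude would index the momentary load and the above-threshold duration would index how long that load is sustained, which is the distinction drawn in the text between easy and difficult multiplication problems.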

Other types of complex problem solving tasks show similar relationships between pupillary dilation and
problem difficulty. For example, Bradshaw (1968) has reported that larger pupillary dilations accompany
the solving of more difficult anagrams, and that these dilations are maintained until solution is reached.

Summary: Pupillometric measurements have now been obtained in a variety of simple information-processing
tasks under laboratory conditions. They appear uniquely sensitive to subtle differences in processing
load obtained in these tasks. Processing load appears to increase the activation of brainstem arousal
systems in measured amounts. These activation responses are of short duration, of an extent that
accurately reflects load, and occur at short latency. The responses do not habituate, and therefore may
be assumed to reflect a fundamental physiological response to increase in cognitive workload. As such,
they suggest an alternative to traditional methods of quantifying workload, a possibility that is
explored in the following section of this report.

FUTURE  APPLICATIONS  OF  PHYSIOLOGICAL  MEASURES  TO  THE  ASSESSMENT  OF  PILOT  WORKLOAD 

No investigation has yet been published in which pupillometric methods have been employed in the
measurement of pilot workload. Perhaps the most direct application of these methods to practical
performance assessment is Peavler's (1974) use of pupillometric measures to assess fatigue in telephone
operators after working full shifts on different types of computer-based information retrieval systems.
Peavler found that the more automated method, which was both more efficient and more taxing, resulted in
greater operator fatigue, as indexed by mean decrease in pupillary diameter from pretask to posttask
measurements. Thus, Peavler was not concerned with the question of task-induced pupillary dilations and
instantaneous workload levels, the topic of the present report.

The body of research summarized above certainly makes a theoretical contribution to the study of
workload, suggesting that workload can be measured by a physiological response to task load, rather than
by behavioral interference or subjective report. In my opinion, these methods may be of practical
consequence as well.

The most natural application to the problem of pilot workload would seem to be in the area of design
of equipment and pilot procedures, in which the workload parameters of each of several design options
might be assessed separately using experimental methods similar to those outlined above. Here, one might
ask questions concerning optimal information formatting to determine a communication structure that
minimizes operator load. The method is particularly well suited for the design of the more cognitive
components of the pilot's task, analogous to the mental arithmetic experiments described above. It is
precisely this aspect that would seem to be most difficult to measure by conventional workload assessment
procedures.

One could conceivably construct a simulator in which pupillometric measurements might be made to test
workload in a more realistic environment. However, in my opinion, the problems of adequate control of
visual input in such a situation would seriously impede its usefulness. As mentioned above, strict control
of visual input is necessary for the pupillometric measurement of workload, as the large magnitude changes
in pupillary diameter that are produced during a visual scan of a non-homogeneous visual field introduce
serious artifacts in the pupillometric record. Until such problems are solved, the use of pupillometry
in more natural environments will be restricted at best.

Finally, some attention should be paid to the use of other physiological measures such as the EEG
in the assessment of workload effects. An inspection of the current literature is not promising in this
regard, as no large magnitude and robust relations between EEG and workload have been reported despite
a reasonable amount of experimental work devoted to this problem. The development of an EEG measure of
workload would be of some practical interest, as the EEG is not dependent on small changes in visual input
as is the pupil. The question of an EEG measure of workload is presently being pursued in my laboratory
under ONR support. We are using the mental arithmetic and short-term memory tasks which have such strong
and reliable effects on autonomic indicators of load, including the pupil. Pupillometric data are also
being analyzed. EEG data are being systematically recorded from each of the 19 sites in the Ten-Twenty
recording system (Jasper, 1958) and stored for subsequent analysis. By proceeding in a systematic manner
in the analysis of the EEG and by continuing use of the pupillometric measures to assess the effectiveness
of the manipulations of processing load, we hope to finally discern the central signs of processing load
which are so clearly observable in the autonomic periphery.

In summary, physiological methods provide a unique alternative to the traditional, but in various
ways unsatisfactory, methods of workload measurement: subjective estimation and behavioral interference
with a secondary task. Of the physiological measures, the task-evoked pupillary responses provide the
clearest indication of both the degree of load imposed by a particular task or function and the
fluctuations of that load over time. Although some restrictions are necessary to ensure accurate pupillometric
recordings, the use of pupillometric methods for workload assessment would seem to be feasible,
particularly in evaluating the load imposed by complex cognitive tasks.

Figure 1. Average pupillary diameter during presentation and recall of strings of 3 to 7 digits,
superimposed about the two second pause between presentation and recall. Slashes indicate
the beginning and the end of the memory task.

Figure 2. Average pupil diameter for five subjects during presentation and report of
seven-digit telephone numbers from short-term and long-term memory. The
long-term memory function is broken above the brace, with both points
representing the same pupillary measurements.

Figure 3. Pupillary diameter during presentation and recall of four digits,
words and a digit transformation task.

Figure 4. Average pupillary dilation during the decision period and percent errors as a
function of the frequency of the comparison tone. The frequency of the standard
was 850 cps.

Figure 5. Averaged evoked pupillary responses in a mental multiplication task of three
levels of problem difficulty.

ABSTRACT

Three years of aircrew performance measurement related to air combat effectiveness using the Navy's
Air Combat Maneuvering Range (ACMR) are presented as evidence of ACMR's research potential. Performance
assessment methods used to evaluate pilot proficiency are described. The aircrew assessment methods have
been used to identify squadron performance differences, evaluate competitive exercises, and provide
diagnostic training feedback to operational users. The use of continuously recorded quantitative measures
from systems such as ACMR should stimulate more aircrew performance field research ideas. The availability
of objective performance criteria promises to be of substantial benefit to both the operational user and
the research community in such areas as pilot selection and training, fleet combat readiness, and pilot
workload and stress.

INTRODUCTION 

Background: The selection, training and assessment of military aviators, and problems associated with
the acquisition and retention of flying skill, have occupied aviation psychologists for over 30 years
(Thorndike, 1974). The major problem in this line of research has been, and continues to be, the lack
of objective criteria (North and Griffin, 1977) for evaluating the effectiveness of aviation training in
general, and aircrew proficiency in particular. Traditionally, the use of subjective estimates has
provided the only means to assess training progress in acquiring and maintaining aviation skills.

The recent growth of the Navy's Air Combat Maneuvering Range (ACMR) has provided a unique opportunity
to obtain objective measures of aircrew performance that have not been available in the past. For the past
three years the authors have been involved in a research program to develop objective aircrew performance
criteria from ACMR quantitative output measures. Two technical reports (Brictson and Ciavarelli, 1976 and
1978) have been written which detail the technical approach, performance assessment methods, and
preliminary results of aircrew performance measurement on selected training objectives. The ACMR criterion
development research is sponsored by the Navy Aerospace Medical Research Laboratory, Pensacola, Florida,
in order to provide the Laboratory with performance criteria to validate, among other things, vision
laboratory results, aircrew selection practices, and training effectiveness. The availability of such
criteria, however, has perhaps more far reaching implications for the expansion of aircrew research
efforts in an operational environment.

Air Combat Maneuvering Range (ACMR): The ACMR is a sophisticated training facility acquired by the Navy
and now in use to train fighter aircrews in air-to-air combat. The system is designed to train aircrews
in actual combat maneuvers and in recognition of weapon delivery boundaries. ACMR provides data display
features which greatly enhance air combat debriefs, and provide a rich source of continuously recorded
quantitative measures. Some of the capabilities of ACMR include the following:

1. Real-time tracking of aircraft engaged in air combat training in a specified airspace,

2. Video tape playback of flight history data, complete with pictorial display of the air-to-air
engagement and voice transmissions,

3. Both digital and graphic hard-copy printouts of flight instrument data, interaircraft positions,
cockpit view of engaged aircraft, mission data, and

4. Computer generated estimates of weapon launch outcomes.

ACMR as a system enables training and research personnel to monitor in real-time various air combat
training exercises, and through exercise replay, provides the opportunity to review, debrief and evaluate
pilot tactics, decisions, and weapon delivery accuracy. In addition, expected ACMR advances are designed
to obtain measures in attack mission roles as well. Planned system augmentation will cover no-bomb-drop
scoring, mine laying operations, and anti-radiation and electronic warfare missions. The whole array of
operational missions and their slow-motion replay will soon be within the province of aviation research
teams to better understand and resolve the complexities of pilot/aircraft matchups.


RESEARCH  REVIEW 

In-Flight Assessment Methods: Our technical approach (Brictson, Ciavarelli, et al., 1977) describes an
appropriate systems framework, training content, and performance assessment methodology for the development
of reliable and valid ACMR criterion performance measures.


Measures from over 600 ACMR dog fights have been obtained across a variety of aircraft and weapon
systems, and under varying training missions and operating conditions. A performance assessment methodology
was developed and used to evaluate aircrew and squadron air combat performance. The performance assessment
methods include analysis of engagement outcomes (wins, losses, draws) as well as task accuracy measures
associated with successful weapon delivery. Recently, we have developed metrics from the analysis of
antecedent events (i.e., radar contact, initial visual acquisition, first engagement shot) in order to
estimate the probability of any given outcome, given certain antecedent conditions.

Collectively, these assessment methods provide a complete measurement system for estimating aircrew
and unit proficiency in all aspects of air combat maneuvering. We will soon be able to provide
longitudinal and objective data on all critical phases of air combat maneuvering.
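
As a rough illustration of estimating the probability of an outcome given antecedent conditions, the relative-frequency calculation below may help. It is a sketch only: the record layout, the field names (`first_visual`, `outcome`), and the use of simple relative frequencies are assumptions made here, not the ACMR scoring method, which is not specified in this paper.

```python
from collections import Counter


def outcome_probabilities(engagements, antecedent):
    """Estimate P(outcome | antecedent) as a relative frequency.

    engagements: list of dicts, each with an "outcome" key
                 ("win", "loss" or "draw") plus hypothetical antecedent fields.
    antecedent:  function returning True for engagements that satisfy
                 the antecedent condition of interest.
    Returns an empty dict when no engagement satisfies the condition.
    """
    matching = [e["outcome"] for e in engagements if antecedent(e)]
    if not matching:
        return {}
    counts = Counter(matching)
    return {outcome: n / len(matching) for outcome, n in counts.items()}


if __name__ == "__main__":
    # Made-up engagement records for illustration only.
    engagements = [
        {"first_visual": True, "outcome": "win"},
        {"first_visual": True, "outcome": "win"},
        {"first_visual": True, "outcome": "loss"},
        {"first_visual": False, "outcome": "loss"},
    ]
    # Probability of each outcome given that the crew gained first visual contact.
    print(outcome_probabilities(engagements, lambda e: e["first_visual"]))
```

With a few hundred engagements, estimates of this kind would be stable only for antecedent conditions that occur often; rare combinations would need pooling or a fitted model.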

Performance Results: Since ACMR instrumentation provides so many output measures there is a danger of
indiscriminate selection of candidate measures of performance. We ran across many occasions where it was
tempting to measure 'everything that moves,' but we chose instead to look at the statistical and practical
aspects of the data, recognizing fully that if your results do not make sense to the operational community
they will not be used.

To arrive at a reduced set of candidate measures we first identified thirteen air combat training
objectives and, using various logical and documentary criteria, selected weapon envelope recognition as
the most critical to success. A comprehensive statistical analysis, using ANOVA, multiple correlation
and discriminant analysis, resulted in the selection of two statistically and practically significant
variables from the multitude of measures available on ACMR. In the final analysis a single error score,
which was defined as a deviation from ideal weapon delivery boundary zones, proved to be the most promising
measure of envelope recognition task accuracy. Based on that conclusion we have now developed empirical
distributions of these error scores for high and low pilot performance and experience continua for use
as baseline data to evaluate any future training innovations or system improvements in envelope recognition.
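
An error score defined as a deviation from an ideal weapon delivery boundary zone could take a form like the sketch below. The paper does not give the actual formula, so everything here is hypothetical: the envelope is reduced to a single range interval, the units are arbitrary, and the function names are invented for illustration.

```python
from statistics import mean


def envelope_error(launch_range, min_range, max_range):
    """Distance by which a launch falls outside an assumed range envelope.

    Returns 0.0 for a launch inside [min_range, max_range]; otherwise the
    distance to the nearest boundary. A real envelope would also depend on
    aspect angle, closure rate and other variables not modelled here.
    """
    if launch_range < min_range:
        return min_range - launch_range
    if launch_range > max_range:
        return launch_range - max_range
    return 0.0


def mean_error_score(launch_ranges, min_range, max_range):
    """Mean envelope error over a set of launches (e.g. one pilot or squadron)."""
    return mean(envelope_error(r, min_range, max_range) for r in launch_ranges)


if __name__ == "__main__":
    # Made-up launch ranges against a hypothetical 1.0 to 2.0 unit envelope.
    print(mean_error_score([0.5, 1.5, 2.5], min_range=1.0, max_range=2.0))
```

Collecting such scores separately for groups known to differ in performance or experience would give the kind of empirical baseline distributions the text describes.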

In general, the progress of ACMR performance criteria development has produced some very promising
results. We have, for example:

o Identified key variables related to successful weapon delivery,

o Developed preliminary criteria for evaluating aircrew performance in envelope recognition, and

o Devised scoring metrics based on engagement outcomes and task accuracy measures which have
demonstrated their effectiveness in discriminating known performance differences.

More importantly, we now have in hand a list of statistically and practically significant variables which
not only account for the major portions of variance related to air combat success but are also (and
this is critical to measurement success) understood and accepted by the operational user, i.e., pilots
and training officers.

Efforts are continuing to further refine and expand performance assessment techniques, and to
establish the statistical integrity of the data base for ultimate application in support of both operational
training and validation of ongoing aviator research. While the training applications of these data
are readily acknowledged, the research aspects and potential have yet to be realized in the research
community at large. We hope that this brief foray will entice other aviation research teams to utilize
the tremendous capability now available in ACMR systems emerging around the world.

A NEW ERA

For the past 30 years aviation psychologists, given the lack of objective operational measures, have
been forced to do research designed primarily to enhance the reliability and validity of subjective and
second order 'criterion measures.' Usually the criterion measures so developed rested on the use of
flight instructor subjective estimates or peer ratings, which met with various degrees of success. With
the arrival of training systems such as ACMR, aviation psychology has crossed the threshold into a new era.

The availability of continuously recorded and objective output measures, along with on-line computer
analysis and display, presents the researcher with a completely new capability to evaluate performance 'on
the job.' Although much remains to be done to demonstrate the generalizability of initial performance
assessment methods developed to date, the methods have already been successfully demonstrated across small
samples and show remarkable promise.

The utility of reliable and valid objective performance criteria cannot, and should not, be
underestimated. From an operational viewpoint, the measures are essential for judging the progress of ACMR
training, estimating aircrew proficiency levels, and for determining the combat readiness of operational
units.

On the other hand, the research community now has at its disposal operational measures as potential
validation criteria for ongoing aviator selection, training and research programs. The air combat mission
is most certainly one of the most demanding tasks in terms of skills required and stresses experienced.
ACMR provides a vehicle for the field validation of research directed at understanding the acquisition
of these skills and the conditions under which they may be enhanced or degraded.

Going hand-in-hand with operational measures related to aviator combat missions is the present
availability of aircraft carrier final approach landing performance scores (LPS), which have already been
tested and validated in the fleet (Brictson et al., 1973). ACMR and carrier landing measures, used
independently or in combination, provide a unique opportunity to support ongoing research related to the
selection, training, and performance effectiveness of Navy aviators.

Given the availability of operational performance measures, researchers can more effectively address
some of the questions that have arisen over the history of aviation research. Some of these questions
are of very high priority to the nation's defense in general, and to Naval aviation in particular. For
example:

1. When is an aircrew, squadron or fleet considered combat ready in ACM?

2. What are the effects of ACM on pilot physiological responses?

3. How frequently is ACM practice required to maintain proficiency?

4. What is the range of pilot stress tolerance to ACM missions?

5. What is an acceptable standard of ACM performance for different training levels?

6. How can operational performance measures be used to select top aviators?

7. And finally, what are the effects of sustained operations, prolonged duty hours, and operational
workload on the performance effectiveness of Naval aviators?

The answers to these and other operationally relevant questions can now be obtained given access to
on-line performance measurement systems such as ACMR, which represents an unequalled opportunity for
aircrew performance research.

RESEARCH  OPPORTUNITIES 

Of prime importance to research workers dealing with aviator workload, stress and fatigue is the
intriguing notion of an on-line pilot monitor system during air combat missions. Long considered to be
one of the more stressful and demanding pilot tasks, an air-to-air engagement taxes the pilot physically,
mentally and perceptually. The possibility of complementing on-line pilot performance measures with
on-line physiological measures such as heart rate, blood pressure, etc. would provide an ideal arrangement
for the research team interested in validating laboratory notions of stress, fatigue or workload in an
operational 'real world' environment.

A word of caution is advised. Some research teams used to the controls and precision design of
experiments in the laboratory will be limited in their attempts to control the real world. But that is
exactly the point. Many laboratory studies stress the statistical significance of results without strong
support for practical or operational significance. In pilot workload, for example, the amount or severity
of workload in either a 24-hour or flight segment is certainly useful to 'describe' the environment but
does not by itself have any practical significance unless it can be related to performance effectiveness,
short or long term. Our physiological reactions to stress or workload can assuredly be measured but it
is only in the context of their relation to performance that they acquire operational significance.

With  the  advent  of  sophisticated  instrumentation  systems  like  ACMR  and  the  concurrent  development  of 
performance  criterion  measures  the  final  building  block  in  field  calibrated  research  is  in  place.  All 
that  now  remains  is  the  historical  challenge  of  innovative  and  understandable  test  designs  that  can  answer 
operationally  significant  problems. 

Our own approach in ACMR is to provide, first of all, valid and reliable performance criteria.
Secondly, we want to obtain a longitudinal performance data bank based on pilot biographic, experience,
biochemical, sleep, mood and workload components. Third, and most important, is our interest in having
a field laboratory that can provide an arena to explore, define and predict the influence of pilot temporal
variables on aviation performance effectiveness.

The ACMR system, while now prevalent in the continental U.S.A., is also being made available to NATO
nations for training purposes at a location in Sardinia. NATO scientists, ideally, could have access to
the performance data through part-time use of the facility for research purposes. Many of the papers
recently discussed at the 1977 Cologne AGARD Panel meeting on pilot workload could benefit from on-line
performance measurement data such as that provided by ACMR. In addition to land based ACMR systems there
is a strong likelihood that ACMR, with its vast potential for tapping continuously many aspects of pilot
performance and physiological responses, will also be available at sea, aboard various U.S. Navy aircraft
carriers. If that planned installation occurs then the use of ACMR for research purposes could greatly
expand due to greater availability of ACMR facilities at sea and ashore. Regardless, it is now possible
to obtain from ACMR reliable and valid operational measures of air combat maneuvering. Such measures
should provide a wealth of opportunity for research teams from NATO, USN and USAF communities.

SUMMARY

The  use  of  speech  patterns  in  the  analysis  of  workload  is  examined.  The  rather  sparse  amount  of 
research  effort  expended  in  this  field  is  reviewed  in  terms  of  a  simple  model  of  speech  production  and  the 
applications  of  current  analysis  techniques  are  considered. 


INTRODUCTION 

There is much intuitive evidence to suggest that high workload or stress may change the fundamental
characteristics of speech, and so although the voice may not exhibit obvious variations during normal flight
profiles, a search for change in speech may prove to be a worthwhile approach in the investigation of
workload in air operations. However, central to the possible use of speech patterns is the requirement to
reduce complex speech data to parameter sets of a manageable size, and to relate these sets to the
psychological and physiological state of the pilot. Optimum choice of parameter sets constitutes a difficult
task, but there is an ever increasing literature concerned with speech processing which provides many
techniques of analysis.

Reliable voice parameters may be extracted from the relatively poor quality speech of existing flight
communication channels, and so speech patterns may prove to be useful, as they overcome the need for
subject instrumentation and data collection (see for example Refs 1-3). Correct choice of speech parameters
may make it possible to assess changing workload patterns, and this may be important in the military
environment where rapid fluctuations in workload and stress are encountered, and where many other methods,
such as those which rely on biochemical analysis (see for example Ref 4), may be of little value.

THE  SPEECH  WAVEFORM 

There are several questions raised by the use of voice analysis in the investigation of workload in
aircrew. In common with many other techniques, the processes underlying the variations in voice parameters
are uncertain. An effect may be produced in response to endocrine changes, in which case it is likely that
the response time will be long in relation to the duration of the workload and the induced stress. If on the
other hand, the change arises essentially from increased neurophysiological activity, then the response time
of the effect will be rapid. In each case, the basic mechanisms of speech production would be similar and are
documented in the literature (see for example Ref 5), but there is little information concerning the
processes which may invoke variation under high workload.

Further, vocalization is a conscious process. The majority of fundamental parameters which
characterise a given speaker may be split into those which transmit the information content of the spoken
word, and those which do not contain this semantic information. The former group, for example formant
frequencies, vary according to the particular utterance, but over long periods of time, as changes due to
semantic information average out, the range of variation is beyond direct conscious control. The latter
group contain measures such as fundamental frequency or pitch, which are responsible for intonation.
Western languages do not require changes in pitch to transmit semantic information, but conscious control
may produce significant short term variations. Although the nature of the short term changes may be of
interest, mean fundamental frequency in the long term is again thought to be beyond direct conscious
control.


In  view  of  these  considerations  it  is  worth  reviewing  the  way  in  which  a  set  of  parameter  estimates 
should  be  used.  As  an  illustration,  a  voice  parameter  from  a  pilot  is  measured  during  the  course  of  a 
single  flight,  and  it  is  assumed  that,  initially,  there  is  no  knowledge  of  the  influence  of  high  workload. 
The  time  course  distribution  of  the  estimates  of  this  parameter  during  the  flight  will  depend  upon  the  times 
at which the pilot chooses to speak. The estimates are likely to be corrupted by noise due to poor
recordings, problems of measurement and random or conscious variations in the pilot's voice. The absolute values
are  likely  to  be  of  little  value,  but  relative  changes  through  the  flight  profile  may  be  of  greater 
interest.  Statistical  methods  are  available  to  establish  whether  any  trends  exist,  and  to  test  if  a 
particular  aspect  of  the  flight  profile  shows  a  significant  change.  Similar  methods  would  be  applicable  if 
estimates  of  the  same  speech  parameters  from  an  unstressed  situation  were  available,  either  in  flight  or  on 
the  ground.  Measures  of  the  relative  change  could  be  quite  useful,  given  a  sufficient  knowledge  of  the 
flight  profile  which  would  identify  times  of  high  workload.  However,  utterances  are  often  short  and 
randomly  dispersed  through  the  flight  profile,  and  so  in  preliminary  studies,  it  would  be  desirable  to 
correlate  with  other  physiological  data.  Such  data  are  easier  to  gather  from  transport  flights  because  of 
the ease of instrumenting the pilot.
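
The statistical treatment outlined in this paragraph can be illustrated with a short sketch. It assumes that each utterance yields one parameter estimate with a time stamp. The function names, the least-squares slope and the Welch t statistic are illustrative assumptions and are not methods named in the text.

```python
import math
from statistics import mean, variance


def relative_change(estimates, baseline_mean):
    """Percentage change of each estimate relative to an unstressed baseline mean."""
    return [100.0 * (x - baseline_mean) / baseline_mean for x in estimates]


def trend_slope(times, values):
    """Least-squares slope of the parameter against time (parameter units per time unit)."""
    t_bar, v_bar = mean(times), mean(values)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    sxx = sum((t - t_bar) ** 2 for t in times)
    return sxy / sxx


def welch_t(sample_a, sample_b):
    """Welch t statistic comparing two flight phases (mean of b minus mean of a)."""
    se = math.sqrt(variance(sample_a) / len(sample_a)
                   + variance(sample_b) / len(sample_b))
    return (mean(sample_b) - mean(sample_a)) / se
```

A slope close to zero would suggest no trend through the flight, while a large Welch statistic between, say, cruise and approach estimates would suggest that the phase of interest differs from the rest of the profile. With the short, irregularly spaced utterances described above, such statistics would need to be interpreted with caution.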

Data from many flights are necessary to establish the existence or otherwise of specific trends
related to high workload, and if trends are found, it would be necessary to establish whether they are
reproducible in different pilots. Studies of changes in the voice under stress have demonstrated wide
inter-subject variability (Refs 6-7), and these observations raise a much broader question, concerning the
way in which various parameters from different pilots could be evaluated for indications of high workload.
Given speech data from a single pilot, it is possible to use techniques of ever increasing complexity until
changes are found which significantly reflect high workload situations. Such studies are time consuming,
and, even so, the final technique may or may not be relevant to other pilots. Alternatively, simpler
techniques may be applied to data from several pilots in an effort to establish trends across pilots.
Intuitively, the latter approach is felt to be more realistic, even if some aspects of speech requiring
involved methods of extraction are ignored.

It is also necessary to bear in mind the techniques of analysis which are available, and how they
relate to the quality of the available data. For example, inverse filtering methods may be used to obtain
the shape of the glottal pulse, which is the basic element in the quasi-periodic waveform responsible for
fundamental frequency. To do this, speech recordings of adequate quality are essential (Ref 8). Finally,
when considering the existing literature, the relevant information is sparse, but section 4 offers a brief
review, in which it is hoped the problem and various approaches may be put into perspective.

MODEL  OF  SPEECH  PRODUCTION 


Though a description of the physiology of speech production would be out of place in this review,
it is nevertheless worthwhile to describe the genesis of the speech waveform. This presents the opportunity
to identify aspects of the waveform which can be measured, and which may reflect a high workload
situation. Figure 1 illustrates the way in which speech may be broken down into its phonetic components.


Fig 1

Phonetic  composition  of  the  speech  waveform 


Any utterance consists of periods of vocal activity and non-activity, known respectively as speech
intervals and pause intervals. In isolation, the latter are of no interest, but together they provide
information on the speech pause ratio, and on the overall rate at which the pilot is talking. This
apparently trivial point is of some importance, especially when obtained as part of an analysis of the
speech waveform envelope shape. The envelope shape reflects the duration of phonetic segments as well as
overall articulation, or the precision with which different sounds are produced. Such measurements may
contain information on high workload situations (Ref 6), even though it is likely that this information
only reflects changes in the pattern of respiration. Unfortunately, the discontinuous nature of cockpit
communication rarely provides a speech epoch of sufficient length for this form of analysis.
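
As an illustration of the speech pause ratio, the sketch below classifies short frames of the waveform as speech or pause by comparing a frame energy with a threshold. The frame length, the fixed threshold and the function name are assumptions made for the example. A practical detector would need to cope with cockpit noise and is not described in the text.

```python
def speech_pause_stats(frame_energies, threshold, frame_s=0.01):
    """Summarise speech and pause intervals from a sequence of frame energies.

    frame_energies: one short-term energy value per frame (hypothetical input).
    threshold: energy above which a frame is counted as vocal activity.
    frame_s: frame duration in seconds.
    """
    active = [e > threshold for e in frame_energies]
    speech_frames = sum(active)
    pause_frames = len(active) - speech_frames
    # A speech interval starts wherever an active frame follows a pause (or the start).
    intervals = sum(1 for i, a in enumerate(active)
                    if a and (i == 0 or not active[i - 1]))
    ratio = speech_frames / pause_frames if pause_frames else float("inf")
    return {
        "speech_s": speech_frames * frame_s,
        "pause_s": pause_frames * frame_s,
        "speech_pause_ratio": ratio,
        "speech_intervals": intervals,
    }
```

Dividing the number of speech intervals by the total duration gives a crude indication of the rate at which the pilot is talking.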

Figure 1 also shows that speech intervals may be divided into voiced and unvoiced segments. This
broad classification is dependent upon the presence, or absence, of vocal chord activity. Speech intervals
can be described by the model illustrated in Fig 2, which is based on the acoustic theory of speech
production (Ref 5). In digital form, the model has found extensive application in computer based analyses
which extract voice parameters from the speech waveform (see for example Refs 8-12).

Fig 2

Model based on the acoustic theory of speech production

The first part of the model comprises two possible excitation sources and a source filter. The type of
speech depends largely on the excitation source, with the random noise generator producing unvoiced sounds
or fricatives. In actual speech, a constriction is formed in the vocal tract and air is forced through it,
generating turbulence, and hence noise. A combination of the random noise generator and impulse train
generator can produce the so called voiced fricatives, and plosive sounds are created in a transitional
phase between pause intervals and voiced or unvoiced speech intervals. However, none of these three types of
sound has any simple application to the current problem.

More important are the vowel sounds, or voiced speech sounds, which are derived from the quasi-
periodic impulse train generator. The instantaneous period of the pulses defines the fundamental frequency
of the voice, which usually lies in the range 80-300 Hz. Many speech analysis-synthesis systems are based
on voiced speech models, and as a consequence the vocal source spectrum and vocal tract resonators need
only be considered in relation to this type of excitation. The concept of an impulse train is an
idealization, because in practice puffs of air are released into the vocal tract by vibration of the vocal
chords. The shape of each puff, known as the glottal pulse, is determined by the vocal source spectrum,
and is largely dependent upon the state of the larynx and vocal chords. Figure 3 illustrates a simple
electrical circuit which represents the sub-glottal system.

Fig 3

An electrical circuit representation of the sub-glottal system

The bronchi and trachea are represented as T sections, driven by a voltage representing the alveolar
pressure, Palv. Elastic recoil in the lungs, ie charge on the lung compliance capacitor C, is sufficient to
produce normal expiratory airflow. The lung tissue resistance is negligible and may be ignored. During
phonation, however, inspiratory muscle activity will produce a negative (subatmospheric) intrapleural
pressure and impede expiration. This produces a highly regulated expiratory flow through the glottal area.
Usually airflow is small and so the sub-glottal pressure, Pa, and alveolar pressure are nearly the same. The
resistance and inductance, Rg and Lg respectively, represent the variable area glottal orifice. For voiced
sounds in the normal pitch range, the resistive term is dominant. In the context of stress analysis, the
properties of this model are conveniently summarized by the fundamental frequency of vocal chord activity
and vocal source spectrum, as viewed from the vocal tract.

The physical factors which control fundamental frequency and the vocal source spectrum are closely
related, and it has been suggested that they are important in evaluating high workload situations (Refs 7
and 13). This implies that the larynx is subject to the normal neuromuscular manifestations of stressful
situations (Ref 14). Once again, the respiratory pattern may be important since an increase in sub-glottal
pressure can change the shape of the vocal source spectrum by effectively narrowing the glottal pulse.

Much of the literature concerned with stress in the human voice has used fundamental frequency as the
indicator, but the glottal waveform has found little application, presumably due to the computational
complexities involved in its measurement.

The final feature of the vocal source is the gain multiplier, which has the effect of controlling the
overall loudness of the speech signal. Except under controlled recording conditions, it is difficult to
make use of amplitude information, or equivalently, absolute values in power spectra. There is an added
complication  in  that  an  increase  in  the  loudness  of  the  voice  is  generally  accompanied  by  an  increase  in 
fundamental  frequency.  In  the  final  stages  of  a  let-down,  approach  and  landing,  a  possible  increase  in  the 
fundamental  frequency  of  the  pilot's  voice  may  not  be  due  to  high  workload,  but  rather,  an  increase  in 
voice  loudness  related  to  increased  engine  noise. 

The vocal tract is characterised by a series combination of quasi-time-invariant linear band-pass
filters, which are often termed formants. Each filter is characterised by a resonant frequency and a
bandwidth. Realistic estimates of the resonant frequencies or formant frequencies may be obtained from
relatively straightforward processing schemes, but this is not true of the formant bandwidths. However,
speech experiments have suggested that with constant bandwidth parameters, time varying combinations of the
formant frequencies offer realistic syntheses of the vowel sounds in sections of voiced speech.

The essential structure of each formant filter is shown in Fig 4. The filter transfer function is given as

$$H(s) = \frac{1}{s^{2}LC + sRC + 1}$$

and it follows that the spectral peak occurs at

$$\omega_{max} = \sqrt{\frac{1}{LC} - \frac{R^{2}}{4L^{2}}}$$


L, C and R summarize the properties of air motion in a cylindrical tube. L is an acoustic inertance and
remains essentially constant. C is a compliance term which depends on the cross-sectional area of the
vocal tract, while R is a viscous drag term, dependent upon both the cross-sectional area and the
circumference of the vocal tract. Essentially, R controls the formant bandwidth and C the formant frequency.
Both of these parameters vary relatively slowly and so the formant system may be regarded as invariant in
terms of short-time analysis. In this context the process is considered stationary during periods of 20 ms
or less.

Fig 4

Electrical analogue of a single formant resonator
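
The roles of R and C in the single formant resonator can be checked numerically. The sketch below assumes the second-order transfer function H(s) = 1/(s²LC + sRC + 1) and uses hypothetical element values in arbitrary consistent units. The bandwidth expression R/(2πL) is the usual half-power approximation for a lightly damped resonator and is an assumption added here, not a formula taken from the text.

```python
import math


def formant_from_rlc(L, C, R):
    """Return (frequency_hz, bandwidth_hz) for H(s) = 1 / (s^2 LC + sRC + 1)."""
    omega_sq = 1.0 / (L * C) - (R * R) / (4.0 * L * L)
    if omega_sq <= 0.0:
        raise ValueError("resonator is overdamped; no spectral peak")
    frequency_hz = math.sqrt(omega_sq) / (2.0 * math.pi)
    bandwidth_hz = R / (2.0 * math.pi * L)  # half-power approximation (assumed)
    return frequency_hz, bandwidth_hz


def gain(L, C, R, freq_hz):
    """Magnitude of H(s) evaluated at s = j * 2 * pi * freq_hz."""
    s = 2j * math.pi * freq_hz
    return abs(1.0 / (s * s * L * C + s * R * C + 1.0))


# Hypothetical values giving an undamped resonance of 500 Hz and a 50 Hz bandwidth.
L_ = 1.0
C_ = 1.0 / (2.0 * math.pi * 500.0) ** 2
R_ = 2.0 * math.pi * 50.0
freq, bw = formant_from_rlc(L_, C_, R_)
```

With L held constant, changing C moves the resonant frequency while changing R alters mainly the bandwidth, which agrees with the description above. For light damping the frequency returned lies very close to the undamped value of 1/(2π√(LC)).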

In theory there is an infinite number of formants, but in practice, three or four are sufficient to
characterise a voice, although the acoustic theory of speech requires further filter elements for the
correct representation of nasal consonants (Ref 5). In the male, empirical data suggest the first formant
lies in the range 200-900 Hz, the second formant in the range 550-2700 Hz and the third formant in the
range 1100-2950 Hz (Ref 10). Physically, the formant resonators comprise the cavities of the pharynx and
oral and nasal cavities. The tongue, jaws and lips are also able to modify the low order formants. Some
studies have considered possible interactions of stress with formants, essentially by examining changes in
spectral balance within the formant frequency range. Such studies have been qualitative as well as
quantitative (Refs 15-17), and will be considered in more detail later. Intuitively, however, since it is
the glottal waveform which actually characterises the voice, the vocal tract will be of less interest, as
it merely shapes the glottal waveform to produce semantic information (Ref 18).

The last component in the model of speech production is the radiation load. This filters the speech
signal according to the way in which the vocal tract is coupled, via the mouth, to free space. In speech
analysis applications it is often of greatest importance to obtain fundamental frequency and formant
parameters, and so the characteristics of the vocal source spectrum and radiation load spectrum may be
lumped together and removed from the speech signal, as both may be considered time invariant.

MEASURES OF STRESS IN THE SPEECH SIGNAL

In this section we are concerned with methods which have been used to establish whether stress modifies
the speech signal.

Voice Micro-tremor. Although the previous section presented a model of speech production and highlighted
the aspects of vocalisation which are likely to reflect stress, there is a further phenomenon known as voice
micro-tremor, which does not fit into the scheme, but is nevertheless important. Tremor, or to be more
specific, an 8-12 Hz modulation in the human voice, is a fairly recent discovery. Commercially, the
phenomenon has found application as an extension to polygraph lie detector methods, and appears to have met
with some success, at least in a well structured interview situation. One of the first devices offered
simple strip chart recorder output, and required a skilled operator to interpret the results (Ref 19). A
more recent device has a direct digital readout of stress level, but there is little technical information
on its operation (Ref 20).

The proponents of such devices have attempted to explain the principles behind voice tremor, and,
essentially, it is assumed that the muscles controlling the vocal chords exhibit the sort of tremor which
accompanies activity in any of the voluntary muscles. It is postulated that this will cause slight rhythmic
changes in vocal chord tension which will result in an inaudible 8-12 Hz modulation of fundamental
frequency. Similarly, the muscles controlling the throat, lips and tongue are thought to be sensitive
to the same kind of tremor, which will be reflected as a modulation within the first formant bandwidth. In
a stress situation it is assumed that increased nervous activity causes muscle tension to increase
throughout the body and, in the larynx at least, this will reduce the micro-tremor. In a high stress
situation voice micro-tremor may disappear altogether. In view of the supposed mechanism of voice tremor,
this is a rather curious observation, since other manifestations of muscle tremor appear to increase in
the high workload situation (eg Ref 1). However, Inbar et al (Ref 21) have attempted to measure voice
tremor and correlate it with muscle tremor in the area of the larynx. This technique was used to
establish if voice tremor was due to mechanical "subresonances" in the vocal tract, or if it was
generated by increased nervous activity. Their results suggest that micro-tremor is a frequency modulation
of the glottal waveform, and that it is generated by nervous activity. Frequency modulations were also
detected in the first and third formant bandwidths.
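
One way of looking for an 8-12 Hz modulation of fundamental frequency is sketched below. It assumes that a fundamental frequency contour has already been extracted at a fixed frame rate during a sustained voiced segment, and it reports the fraction of the contour's variance lying in the tremor band. The function name, the frame rate and the use of a direct discrete Fourier transform are assumptions made for illustration. The operation of the commercial devices is not documented in the text and is not reproduced here.

```python
import cmath
import math


def band_power_fraction(f0_contour, frame_rate_hz, lo_hz=8.0, hi_hz=12.0):
    """Fraction of the variance of an F0 contour lying between lo_hz and hi_hz."""
    n = len(f0_contour)
    mean_f0 = sum(f0_contour) / n
    x = [v - mean_f0 for v in f0_contour]  # remove the mean pitch
    total = 0.0
    band = 0.0
    for k in range(1, n // 2 + 1):  # positive-frequency DFT bins
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power = abs(coeff) ** 2
        freq = k * frame_rate_hz / n
        total += power
        if lo_hz <= freq <= hi_hz:
            band += power
    return band / total if total > 0.0 else 0.0


# Synthetic 2 s contours sampled at 100 frames per second (hypothetical data).
rate = 100.0
tremor = [120.0 + 2.0 * math.sin(2.0 * math.pi * 10.0 * t / rate) for t in range(200)]
slow = [120.0 + 2.0 * math.sin(2.0 * math.pi * 3.0 * t / rate) for t in range(200)]
```

For the synthetic contour with a 10 Hz modulation, almost all of the variance falls in the band, whereas a slow 3 Hz intonation change contributes almost nothing. In real speech, voiced segments are short and intonation changes are large, so a measure of this kind would require careful validation.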

Although the commercial application of the voice tremor phenomenon is not in line with the current
application, the underlying process would appear to make further investigation worthwhile. Several
studies of commercial devices have been undertaken (eg Refs 14 and 22). Older and Jenny (Ref 14) carried out
a comprehensive evaluation using the voices of astronauts from Skylab III and Skylab IV missions. Their
conclusions suggested that the voice tremor principle, as exploited in commercial devices, would not detect
any possible stress, at least in the Skylab situation. This may suggest that such devices are of real
value only in the structured interview application. However, it should be pointed out that the commercial
devices appear to be of simple design, and since we are not aware of any adequate investigations into the
use of the voice tremor phenomenon in a stress situation, the use of micro-tremor in the analysis of stress
may still prove to be a useful approach.

General Spectrographic Measurements. These methods attempt to quantify sound spectrograms either by
visual inspection, or by direct measurement. Such methods can be effective in demonstrating changes in
the voice, but it is difficult to obtain precise measures. The most important spectrographic analyses have
used either wide band filters (200-400 Hz bandwidth) to emphasise the formant resonances in the speech
spectrogram, or narrow band filters (less than 50 Hz bandwidth) to highlight the harmonic structure due to
fundamental frequency.

Kuroda et al (Ref 13) have defined a quantity from the narrow band spectrogram known as the vibration
space shift ratio (VSSR). This is simply derived from measurement of the frequency band spacing during
voiced speech, and relates to the relative changes in fundamental frequency between normal and high stress
situations. Thus if in the normal situation, frequency band spacing is given by SVS, and in the high stress
situation, by KVS, then


VSSR

Real situations in which military pilots found themselves in difficulties were examined. Highly significant
increases in fundamental frequency were reflected in the VSSR, but each case represented a catastrophic
situation and three are known to have resulted in a fatal accident. Generally, in such situations, machine
analysis is not necessary to demonstrate the gross increases in fundamental frequency attributed to both
intense fear and concomitant increases in voice loudness. In the more commonly encountered high workload
or stressful situation, changes in voice parameters would be expected to be much less dramatic, and only
then would it be necessary to use some form of machine analysis.


More general evaluations of the way stress may appear in the spectrogram have been carried out by
Williams, Stevens et al (Refs 6, 7, 18). Some early studies used data, and produced results, which were
very similar to those detailed above, although the fundamental frequency contour was also deemed to be of
importance. More comprehensive studies attempted to extract as much information as possible from the
spectrogram, largely by inspection. For instance, irregular structure in the second and third formant
regions of a wide band spectrogram is thought to reflect a non-stable glottal waveform. The results of
these studies have been summarised by the changes in voice attributable to four emotions, namely anger,
fear, neutral and sorrow. These four emotions, in that order, tended to produce a fundamental frequency
which decreased in magnitude and range. Irregular glottal pulses were often seen in the anger and sorrow
situations, while unusual pitch contours were characteristic of the fear situation. Changes were also
noted in the syllabic rate and duration of utterances. However, it should be noted that the majority of
these results were obtained in the laboratory situation. Two methods have been used. Early methods
attempted to induce stress using an arithmetic task (Ref 6), but this is likely to produce failure stress
as well as task induced stress. Wide intersubject variability was observed. The second method used
actors, and the majority of the above results were obtained in this situation. Between the different
emotions, the actors were able to produce clear changes in their speech, but the application of these
results to the real situation is open to question. This is particularly true of the flying task, where the
range of emotions is not directly applicable, and where again, changes in the voice characteristics of
highly trained pilots may be expected to be subtle in all but the most extreme situation.

Average Spectrum Measurements. A second group of stress analysis methods which make use of spectral
information uses the average spectrum. This is the spectrum of a complete utterance, and may involve a
single word or a longer phrase. Changes in fundamental and formant frequencies during the utterance
give pitch and formant peaks a wider and flatter appearance in the average spectrum, and reflect the
overall characteristics of the voice during the complete utterance.


Tishchenko (Ref 15) has suggested that formant frequencies tend to change in the stressed situation,
and that spectral intensities within these bands also change. This led to the definition of the formant
momentum, which is the product of a formant frequency and its intensity. The data in Tishchenko's study
consisted of speech from 23 students before, during and after their first parachute jump. All of the
spectral analysis methods were analogue. Generally, the first formant momentum increased in the stress
situation, and the second and third momenta usually decreased although greater variability was observed.
This behaviour was explained physiologically, but account had to be taken of the different vowel sounds
present in the single words which were analysed. Thus shifts in formants due to different vowels could
augment or reduce apparent shifts due to stress.
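
The formant momentum is simple to compute once a formant frequency and its intensity are available. The sketch below takes the definition given above (frequency multiplied by intensity) and expresses the change between a baseline and a stressed utterance as a percentage. The intensity scale, the numerical values and the function names are hypothetical, since the original analogue procedure is not described in detail.

```python
def formant_momentum(freq_hz, intensity):
    """Formant momentum: product of a formant frequency and its intensity."""
    return freq_hz * intensity


def momentum_change_percent(baseline, stressed):
    """Percentage change in momentum; each argument is a (freq_hz, intensity) pair."""
    m0 = formant_momentum(*baseline)
    m1 = formant_momentum(*stressed)
    return 100.0 * (m1 - m0) / m0


# Hypothetical first formant values for the same vowel before and during stress.
change = momentum_change_percent((500.0, 2.0), (550.0, 2.0))
```

Comparisons of this kind are only meaningful for the same vowel, because a change of vowel shifts the formants by far more than any effect of stress.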


Popov, Simonov, Frolov et al have attempted to analyse stress and emotions using spectral methods
(Refs 16, 17, 24). Early work was directed towards the measurement of changes in the average formant
structure of single words (Ref 16). Results similar to those suggested above by Tishchenko were reported.
The data were obtained from actors, but further studies used speech from the cosmonauts in the Voskhod 2
spacecraft. Again a centroid spectrum method was used to give an indication of relative shifts in formant
peaks. Analogue spectral techniques defined the average spectrum centroid as

$$f_c = \frac{\sum_i f_i P_i}{\sum_i P_i}$$

where the $f_i$ are the filter tuning frequencies and the $P_i$ are the average power outputs of the
filters, measured over the effective time of output for each filter. The choice of filter parameters
appeared to be empirical, but significant relationships were established between relative changes in the
centroid and heart rate, during various stages of the space flight. This, together with a knowledge of the
cosmonauts' tasks at each flight stage, suggested that changes in $f_c$ could reflect a stressful situation.
Further studies have analysed the envelope shape of the output of each of the bandpass filters.
Specifically, the time integral of the output envelope of each bandpass filter is calculated as

$$\int_0^T a_i(t)\,dt$$

where T is the analysis period and the $a_i$ are the envelope shapes. It is suggested that empirical
combinations of these integrals can actually distinguish between different types of emotion, labelled as
fear, anxiety, joy and delight (Refs 16-17). However, reasons for the choice of combinations are not
explained.
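
The two filter bank measures described above can be sketched in digital form. The example assumes that the centroid is the power-weighted mean of the filter tuning frequencies and that each envelope is available as equally spaced samples, with the integral estimated by the trapezoidal rule. The function names, the sampling interval and the numerical values are assumptions, as the original measurements were made with analogue equipment.

```python
def spectrum_centroid(tuning_freqs_hz, avg_powers):
    """Power-weighted mean of the filter tuning frequencies."""
    total = sum(avg_powers)
    if total == 0:
        raise ValueError("no power in any filter band")
    return sum(f * p for f, p in zip(tuning_freqs_hz, avg_powers)) / total


def envelope_integral(envelope, dt):
    """Trapezoidal estimate of the time integral of one filter's output envelope.

    envelope: samples of the envelope at spacing dt, covering T = (len - 1) * dt.
    """
    return dt * (sum(envelope) - 0.5 * (envelope[0] + envelope[-1]))
```

A relative shift in the centroid between two flight stages can then be expressed as a percentage of the baseline value and compared with another indicator such as heart rate.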

Williams and Stevens (Ref 7) have described similar methods in which they have analysed the average
spectra from several seconds of speech. Their findings are merely consistent with increases in speech
loudness in the "anger" situation, and a decrease in speech loudness in the "sorrow" situation. The data
were derived from actors' speech.

Direct Measurements on the Speech Waveform. Recent studies by Simonov et al (Ref 23) have suggested that
crude measures of fundamental frequency (designated by FOT) and first formant frequency (designated by F0)
may be used to discern emotional states. Apparently, these parameters were extracted directly from the
speech waveform, but the analysis methods were not described in any detail. However, variations in each
of the parameters, approaching 100%, were reported. The use of a discriminant function in the FOT - F0
plane was therefore suggested to differentiate between so called states of rest and emotion. Again, the
main bulk of the work was performed with data derived from actors' speech, although the validity of the
method was supposedly confirmed using speech obtained from amateur parachute jumpers. To the amateur, a
parachute jump clearly presents a highly stressful situation, but no mention was made of possible physical
stress interactions.


We  have  performed  similar  experiments  with  the  voice  of  a  commercial  airline  pilot  (Ref  25) .  Cepstrum 
methods  were  used  to  obtain  fundamental  frequency  estimates  and  to  smooth  log  magnitude  spectra,  from  which 
formant  information  could  be  extracted  (cepstrum  methodologies  are  described  in  the  next  section) .  The 
data  consisted  of  22  landings  into  various  international  airports.  For  each  landing,  the  baseline  or 
unstressed  fundamental  frequency  and  first  formant  parameters  were  obtained  from  about  30  seconds  of  speech 
at  the  top  of  descent.  Parameters  in  a  stressed  situation,  as  indicated  by  an  increase  in  heart  rate,  were 
obtained  from  about  30  seconda  of  speech  taken  around  the  touchdown  instant.  Thus  for  each  landing, 
specified  by  the  index  i,  the  data  yielded  four ..parameter * :  fQi |u  and  foi|,  are  respectively  the  unstressed 
and  stressed  mean  fundamental  frequencies,  and  Fj^ |u  and  ara  respectively  the  unstressed  and  stressed 

mean  formant  frequencies.  Normalisation  of  the  data  was  carried  out  in  terms  of  the  overall  mean  parameter 
values derived from the unstressed data in all 22 landings. Thus
where the accent mark signifies a normalised value. These data are summarised in Fig 5, which plots parameter
variations from the stressed and unstressed centroids in each of the 22 landings. The centroid of the
unstressed data lies at the origin, but it can be seen that the stressed data centroid is shifted to a
position representing an increase in first formant frequency, but a decrease in fundamental frequency.
The distance between centroids reflects the degree to which stress is manifested in these two speech
parameters. An application of the $T^2$ test to the raw data demonstrates a difference in centroids
(P < 0.0002), but Fig 5 does suggest that the discriminating power of these two clusters may be somewhat
restricted (Ref 26). A section of speech obtained from either the top of descent or just before
touchdown cannot be assigned to the stressed or unstressed group with any degree of certainty, at least not
solely on the basis of fundamental and first formant frequency measurements. These conclusions are at
variance with those of Simonov et al (Ref 23), who have claimed much greater stress-induced changes in
their speech parameters.


It is apparent that some form of measure on fundamental frequency and its variations is essential in
an analysis of stressful situations. Various measures on the formants, in particular the formant
frequencies, appear to be quite promising.

Fig 5  Summary of fundamental and formant frequency variations during 22 commercial landings
(axis label: fundamental frequency variation, %)

SPEECH PROCESSING TECHNIQUES

Speech  analysis  literature  provides  a  wealth  of  information  covering  many  areas  of  application,  all 
of  which  are  based  on  a  requirement  to  reduce  a  speech  signal  to  a  concise  set  of  parameters.  The 
applications  fall  into  two  categories. 

1. Reduction of bandwidth requirements in speech communication channels and the automatic machine
generation of speech. Neither of these applications is relevant to the current discussion.

2. Speech recognition applications. This broad area requires recourse to statistical and pattern
recognition techniques to classify the sets of speech parameters. Speech recognition can mean the
extraction of phonetic or semantic information, and the methods which have been developed in this area
are often based on prototype or template speech parameters (Ref 27). In the context of stress and high
workload analysis, the techniques of speech recognition which are of greater interest are those aimed at
identifying the speaker rather than his speech. Considerable effort has been expended in developing means
of assigning the voice parameters of an arbitrary speaker to a specific "library" parameter group. It is
possible to identify particular speakers from relatively large populations if a section of their speech is
available for reference purposes. The relevance of these methods is obvious, and it may be possible to
assign short sections of speech from a given speaker to known stressed or unstressed parameter groups. It
may also be possible to develop a system using several levels of stress, rather than a stressed-unstressed
binary quantisation.

A description of all the available processing techniques cannot be attempted, but two techniques which are
felt to be directly relevant in the analysis of stress and workload will be considered. The use of
cepstrum techniques is common and so the implications of these methods will be considered in some detail.

Cepstrum Techniques. Cepstrum analysis is a powerful methodology which may be used to analyse a voiced
speech signal by separating out the contribution due to the glottal pulse and the contribution due to the
formant filters. It is possible to identify voiced and unvoiced intervals, and within the voiced intervals
to obtain estimates of fundamental frequency. Further, the cepstrum technique can provide smoothed
spectral estimates from which formant information can be obtained (Ref 9, 10). Using the model presented
in Fig 2, the voiced speech output signal may be assumed to be a convolution of the vocal source impulse
train with the impulse responses of the various filters in the system. Thus, denoting convolution with a *,

$$ x(t) = p(t) * s(t) * F(t) * r(t) $$

where
x(t) is the speech output
p(t) is the impulse train
s(t) is the impulse response of the source filter
F(t) is the combined impulse response of the formant filters
r(t) is the impulse response of the radiation load filter.

Combining the effects of the source and radiation load so that the vocal source output assumes the form

$$ S(t) = p(t) * s(t) * r(t) $$

then

$$ x(t) = S(t) * F(t) $$

or equivalently, since convolution in the time domain is identical with multiplication in the frequency
domain,

$$ X(\omega) = S(\omega) \cdot F(\omega) $$

where $X(\omega)$ is the speech magnitude spectrum.

A logarithmic transform has the effect of separating the elements of $X(\omega)$ into additive components, ie

$$ \ln\{X(\omega)\} = \ln\{S(\omega) \cdot F(\omega)\} = \ln\{S(\omega)\} + \ln\{F(\omega)\} $$

$\ln\{X(\omega)\}$ has the appearance of an undulating function representing formant structure with a
superimposed "high frequency" ripple representing the harmonic structure of the vocal source spectrum. The
additive components in $\ln\{X(\omega)\}$ are maintained during inverse frequency transformation, which
results in the so-called cepstrum. Clearly, the harmonic structure in the log magnitude spectrum manifests
itself as a sharp peak in the cepstrum from which pitch period may be determined. Several efficient
algorithms are available which implement cepstral pitch peak picking, and we have developed an algorithm
based on a design by Noll (Ref 28), which has proved to be very useful.
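The chain described above (log magnitude spectrum, inverse transform, peak picking) can be sketched in a few lines. This is only an illustration, not the authors' algorithm or Noll's design: the function names, the Hamming window, the 70-400 Hz search range, the small constant added before the logarithm and the synthetic test frame are all assumptions made for the example, and a naive discrete Fourier transform stands in for an FFT.

```python
import cmath
import math

def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for one short frame."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    # Precompute the N complex exponentials once.
    twiddle = [cmath.exp(sign * 2j * math.pi * m / n) for m in range(n)]
    out = [sum(values[t] * twiddle[(k * t) % n] for t in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out

def real_cepstrum(frame):
    """Inverse transform of the log magnitude spectrum of a Hamming-windowed frame."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    # The small constant (an assumption) avoids taking the log of zero.
    log_mag = [math.log(abs(c) + 1e-12) for c in dft(windowed)]
    return [c.real for c in dft(log_mag, inverse=True)]

def cepstral_f0(frame, fs, f_min=70.0, f_max=400.0):
    """Fundamental frequency from the largest cepstral peak in an assumed pitch range."""
    ceps = real_cepstrum(frame)
    lo = int(math.ceil(fs / f_max))
    hi = int(math.floor(fs / f_min))
    peak = max(range(lo, hi + 1), key=lambda q: ceps[q])
    return fs / peak

def synthetic_voiced_frame(fs=8192, f0=128.0, length=512):
    """Impulse train through a decaying one-pole filter: a crude stand-in for voiced speech."""
    period = int(round(fs / f0))
    out, state = [], 0.0
    for i in range(length):
        state = 0.9 * state + (1.0 if i % period == 0 else 0.0)
        out.append(state)
    return out

FS = 8192
estimate = cepstral_f0(synthetic_voiced_frame(FS, 128.0), FS)
print(f"estimated f0 = {estimate:.1f} Hz")
```

Real cockpit speech would also need a voiced-unvoiced decision before any pitch estimate is accepted; that step is omitted here.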

This technique, then, is a relatively simple method of measuring fundamental frequency, and is based
on the harmonic structure of a log magnitude spectrum. As a consequence, the fundamental frequency
component need not be present in the signal being analysed. Further, our experience has shown that the
method works well even in the presence of considerable noise, for instance with a signal to noise ratio as
low as 5 dB (as defined only during voiced intervals). In this context, noise refers to the acoustic noise in
the cockpit environment as well as to any electrical noise introduced by the communication and recording
equipment.

It is of interest to compare the cepstrum method with other simple pitch extraction routines.

McGonegal et al (Ref 30) have evaluated cepstrum methods together with low pass filtering and
autocorrelation techniques. The autocorrelation function is quite similar to the cepstrum except that in the
latter, pitch peaks are more pronounced due to the logarithmic transform in the spectrum. With the
exception of identifying voiced-unvoiced transitions, the three pitch extraction methods mentioned above were
shown to be quite similar in operation.

Cepstrum analysis, however, has other advantages when searching for parameters to characterise the
spectrum. Since the log magnitude spectrum and cepstrum are Fourier transform pairs, the low order
coefficients in the cepstrum contain spectral envelope shape information. This observation provides us
with two alternative methods for obtaining spectral information.

1. The low order cepstrum coefficients may be used directly as parameters which classify the speech
spectrum.

2.  The  cepstrum  may  be  short  time  filtered  and  transformed  back  into  the  log  magnitude  spectrum.  Formant 
picking  algorithms  may  then  be  implemented  to  characterise  the  spectrum. 

The first method is computationally faster, and has found extensive use in talker identification
applications. However, the second method is less prone to corruption due to noise in the original speech
waveform and is physically more meaningful. It is suggested that in the current application, the second
method offers the more viable proposition. Within the framework of cepstrum analysis, there are several
methods of extracting formants. Schafer and Rabiner (Ref 10) have provided a robust method (peak picking)
which makes full use of empirical data. Alternatively, Olive (Ref 29) uses the model of speech production
in an iterative spectrum matching technique (analysis by synthesis). Both of these methods make use of
amplitude information, but this is not practical in the current application. We have had some success using
an algorithm based on Schafer and Rabiner's design. The algorithm disregards amplitude information except
for relative changes within specific formant ranges, and current formant peak picking decisions are, in part,
based on previous decisions.

At this stage we must consider the choice of analysis interval, that is, the length of the speech epoch
used to obtain fundamental and formant frequency estimates. Since these parameters will vary during an
utterance, the analysis interval should be arbitrarily short. In practice, of course, a compromise is
necessary. At least four pitch periods are desirable to obtain a strong peak in the cepstrum, but within
this time it is quite likely that one or more of the formant peaks will have moved, producing a smearing
effect in the spectral envelope. Generally speaking, the analysis interval is chosen to contain up to four
pitch periods (20-40 ms), but successive intervals are overlapped to have centres which may be only 10 ms
apart. Individual fundamental and formant frequency estimates can be used to form contours or profiles
covering a complete utterance.

It is usual to implement cepstrum analysis using fast Fourier transform (FFT) methods, either in
hardware or software. An important problem which is closely related to the choice of analysis interval
concerns the choice of sampling rate and FFT transform size. Assuming formant information is required, a
minimum sampling rate of 8 kHz is desirable; 10 kHz is more usual. For fundamental frequency extraction
alone, lower sampling rates may be used, but this will result in significant quantisation error. It can be
shown that time resolution in the cepstrum is given by

$$ T_{CR} = 1 / F_s $$

where $F_s$ is the sampling frequency.

Consider a pitch peak at the nth cepstrum coefficient. Fundamental frequency is then given by

$$ f_0(n) = \frac{1}{n\, T_{CR}} $$

Resolution in $f_0(n)$ is inversely dependent upon n and so, for a given sampling frequency, the maximum
quantisation error increases as $f_0(n)$ increases. This is illustrated in Fig 6, which plots cepstrally
derived fundamental frequency against the maximum quantisation error, $\Delta Q$, at that frequency. At the
nth cepstrum coefficient the quantisation error is defined as

$$ \Delta Q = \frac{\left| f_0(n+1) - f_0(n-1) \right|}{2} = \frac{1}{(n^2 - 1)\, T_{CR}} $$

Intuitively, for the expected changes in $f_0$ induced by stress and high workload situations, $\Delta Q$
should not exceed 2 Hz. Thus if $f_0$ does not exceed 150 Hz, a minimum sampling rate of 8192 Hz is sufficient.
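The two relations above are easy to tabulate. The following sketch evaluates the fundamental frequency implied by a peak at the nth cepstrum coefficient and the associated quantisation error, both as half the spacing between neighbouring estimates and in closed form. The function names and the sample values of n are illustrative choices, not taken from the text.

```python
def f0_at(n, fs):
    """Fundamental frequency implied by a pitch peak at the nth cepstrum coefficient."""
    return fs / n  # f0(n) = 1 / (n * T_CR), with T_CR = 1 / fs

def quantisation_error(n, fs):
    """Half the spacing between the neighbouring estimates f0(n-1) and f0(n+1)."""
    return (f0_at(n - 1, fs) - f0_at(n + 1, fs)) / 2.0

def quantisation_error_closed_form(n, fs):
    """Equivalent closed form 1 / ((n^2 - 1) * T_CR)."""
    return fs / (n * n - 1)

fs = 8192
for n in (32, 64, 128):
    print(n, round(f0_at(n, fs), 1), round(quantisation_error(n, fs), 2))
```

Because the error grows roughly as the square of the fundamental frequency for a fixed sampling rate, the highest expected pitch sets the sampling rate requirement.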


Fig 6  Effect of sampling rate on fundamental frequency resolution
(axis label: fundamental frequency, Hz)

Given a suitable sampling rate, choice of transform size is restricted by the analysis interval
requirement. But, if formant picking is to be implemented, a large transform size is desirable to give
good resolution in the spectrum and to avoid aliasing problems in the cepstrum.

Now

$$ B_s = F_s / S $$

where $B_s$ is the spectral resolution and S is the transform size.

Given an 8192 Hz sampling rate, a minimum transform size of 1024 points should be used. This implies
an analysis interval of 0.125 seconds. It is therefore usual to pad the actual analysis interval with
zeros for transformation purposes. A schematic illustration of the complete cepstrum analysis procedure is
given in Fig 7.
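As a small worked illustration of the figures quoted above, the sketch below computes the spectral resolution for a 1024 point transform at 8192 Hz and pads a shorter analysis interval with zeros. The 30 ms interval length and the function names are assumptions chosen for the example.

```python
def spectral_resolution(fs, size):
    """Spacing in Hz between adjacent spectral samples, B_s = F_s / S."""
    return fs / size

def zero_pad(frame, size):
    """Extend an analysis interval with trailing zeros up to the transform size."""
    if len(frame) > size:
        raise ValueError("analysis interval is longer than the transform size")
    return list(frame) + [0.0] * (size - len(frame))

fs = 8192
size = 1024
frame = [0.0] * int(0.030 * fs)       # hypothetical 30 ms analysis interval
padded = zero_pad(frame, size)

print("spectral resolution (Hz):", spectral_resolution(fs, size))
print("transform duration (s):", size / fs)
print("samples before and after padding:", len(frame), len(padded))
```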

Linear Predictive Coding. Linear predictive coding is a form of inverse filtering which models the speech
waveform itself rather than various aspects of the speech spectrum (Ref 11, 31). The contributions of the
vocal source and vocal tract to the speech signal are not separated out and it is possible to track rapidly
changing speech processes which may be lost in the relatively long analysis intervals associated with
Fourier methods.

Fig 7  Stylised representation of the stages in a cepstrum analysis

In essence, during the segment of speech to be analysed, the nth speech sample, $S_n$, is approximated by a
weighted sum of the previous p values:

$$ \hat{S}_n = \sum_{k=1}^{p} a_k S_{n-k} $$

The weighting coefficients, $a_k$, can be obtained by first calculating the prediction error, $E_n$, as

$$ E_n = S_n - \hat{S}_n = S_n - \sum_{k=1}^{p} a_k S_{n-k} $$

where $S_n$ is the value of a speech sample and $\hat{S}_n$ is its predicted value. $E_n^2$ is then averaged
over all n in the current speech segment to form a mean square prediction error which is minimised by
choice of the $a_k$. The number of coefficients needed to represent a speech segment is given by p, and
depends on the model chosen to represent the vocal source and tract. It can be shown that twelve
coefficients are generally adequate. Thus for a speech signal digitized at 10 kHz, the 100 samples taken
over a 10 ms analysis interval may be represented by just fourteen parameters, that is twelve weighting
coefficients and a pitch period, together with a binary voiced-unvoiced decision.
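One common way of carrying out this minimisation is the autocorrelation method solved by the Levinson-Durbin recursion. The text does not say which solution method was used, so the sketch below is an assumption in that respect, as are the function names and the synthetic second-order test signal used to check that known coefficients are recovered.

```python
import random

def autocorrelation(signal, max_lag):
    """Autocorrelation r[0..max_lag] of one speech segment."""
    n = len(signal)
    return [sum(signal[i] * signal[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]

def lpc(signal, order):
    """Weighting coefficients a_1..a_p minimising the mean square prediction error
    (autocorrelation method, Levinson-Durbin recursion)."""
    r = autocorrelation(signal, order)
    a = [0.0] * (order + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:], err

def prediction_error(signal, coeffs):
    """E_n = S_n - sum_k a_k S_{n-k}, for every n with p previous samples available."""
    p = len(coeffs)
    return [signal[n] - sum(coeffs[k] * signal[n - 1 - k] for k in range(p))
            for n in range(p, len(signal))]

# Hypothetical test signal: a stable second-order process with known coefficients.
random.seed(0)
true_a = [1.3, -0.6]
s = [0.0, 0.0]
for _ in range(2000):
    s.append(true_a[0] * s[-1] + true_a[1] * s[-2] + random.gauss(0.0, 1.0))

estimated, _ = lpc(s, 2)
residual = prediction_error(s, estimated)
print("estimated coefficients:", [round(c, 2) for c in estimated])
```

For speech, the order would be set to about twelve as stated above, and the residual series is the quantity searched for pitch impulses.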

Fundamental frequency estimates are obtained easily with this method since it can be shown that the
prediction error, $E_n$, is a maximum at the start of each pitch period (Ref 11). A simple peak picking
procedure may be used on the $E_n$ series to identify the points in the speech time series at which a pitch
impulse occurs. Peak picking is independent of the analysis interval and so very short intervals (as low
as 5 ms) may be used, even when fundamental frequency information is required. This offers distinct
advantages over cepstrum methods, which can only evaluate an average fundamental frequency over the duration of
a much longer analysis interval. Furthermore, since predictive coding is a time domain method, it is
significantly faster than corresponding frequency domain methods. However, when using predictive coding for
pitch extraction it is desirable that the pitch fundamental be present in the digitized speech signal.
This, together with the poor quality speech in existing communication channels, makes linear prediction
less attractive in the current application.

The advantage of using very short analysis intervals is of more interest if we consider formant
extraction. It can be shown that the weighting coefficients, $a_k$, define the poles in the vocal tract transfer
function, and it is possible to obtain both formant frequencies and formant bandwidths. However, the
problems associated with poor speech quality will again be an overriding factor.
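The link between the weighting coefficients and a formant can be illustrated with a single resonance, where the pole pair can be written down directly. The sketch below assumes an all-pole model 1 / (1 - a1 z^-1 - a2 z^-2) and the usual mapping from pole angle and radius to frequency and bandwidth; the function name and the 500 Hz, 100 Hz test values are illustrative. A twelve coefficient model would need a general polynomial root finder, which is not shown.

```python
import cmath
import math

def resonance_from_coefficients(a1, a2, fs):
    """Frequency and bandwidth (Hz) of the pole pair of 1 / (1 - a1 z^-1 - a2 z^-2)."""
    disc = cmath.sqrt(a1 * a1 + 4.0 * a2)
    pole = (a1 + disc) / 2.0                  # one root of z^2 - a1 z - a2 = 0
    frequency = abs(cmath.phase(pole)) * fs / (2.0 * math.pi)
    bandwidth = -math.log(abs(pole)) * fs / math.pi
    return frequency, bandwidth

# Build coefficients for a hypothetical 500 Hz resonance, 100 Hz wide, sampled at 10 kHz.
fs = 10000.0
radius = math.exp(-math.pi * 100.0 / fs)
angle = 2.0 * math.pi * 500.0 / fs
a1 = 2.0 * radius * math.cos(angle)
a2 = -radius * radius

frequency, bandwidth = resonance_from_coefficients(a1, a2, fs)
print(round(frequency, 1), round(bandwidth, 1))
```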

STATISTICAL  TECHNIQUES 

In the previous section we have shown the way in which speech analysis allows parameters such as
fundamental and formant frequency estimates to be derived during a short time period, and it has been
suggested that time series profiles of such parameters may indicate stress or high workload. It is
necessary to establish the validity or otherwise of this hypothesis, and so we will briefly consider some
of the statistical techniques which may be used to classify or to group series of speech parameters.

The first question concerns the length and nature of the speech epoch to be analysed. To some extent
this will be dependent upon the type of the statistical analysis. For example, the examination of single
words or single phonemes is only feasible under a limited set of conditions. The same phoneme or word must
be chosen for analysis from the different stages of a flight profile, and this is particularly important in
the case of an analysis attempting to use formant information. With regard to fundamental frequency, even
though it is expected to remain essentially constant, our experience has suggested that in the course of a
single word lasting less than one second, variations due to intonation mask any possible change due to the
workload situation. This problem may be partially overcome by restricting the choice of phoneme or word to
those which may be obtained from similar circumstances during the flight profile. In this respect,
call-signs represent useful information, although the automatic and often emotionless manner in which they are
uttered may or may not be an advantage. The analysis of single word call-signs is being actively pursued.

Problems due to intonation, varying formant structure or simply random variations may be overcome by
analysing longer sections of speech and averaging the resulting parameters. Although a simple averaging
procedure is a valid approach, any other measure which classifies the shape of a parameter profile should be
considered. For example, moments about a fundamental frequency mean are useful since it is unwise to
disregard intonation information completely. When considering such methods, the question of a suitable length
for the speech epoch arises. Long term feature averaging experiments (Ref 32) have suggested that a mean
fundamental frequency obtained over a 20 second epoch reduces the sample variance due to intonation and
random variations to acceptable levels. In this context, the epoch describes 20 seconds of voiced speech,
which represents a considerably longer section of normal speech. In the current application, it is
unlikely that such lengthy sections of speech will be available. It should also be noted that this
methodology derives long term average parameters using a short time analysis technique, and results similar
to those obtained using the spectrographic methods described in section 4 are to be expected. It is
desirable therefore to use parameters other than the simple average.

The above discussion suggests that the voice may be searched for signs of stress or high workload in
terms of single phonemes and words such as call-signs, or in terms of the properties of longer sections of
speech. In either case, a complete data set may be viewed as a series of vectors $x_n$, n = 1, 2, ..., m.
Each vector represents the p speech parameters which are chosen to characterise a particular situation.
Thus

$$ x_n = [x_{n1}, x_{n2}, \ldots, x_{np}] $$

In the case of single phoneme analysis the components in $x_n$ may represent the elements of a pitch or
formant profile, while for longer sections of speech, the components in $x_n$ will represent different
properties of the whole speech epoch. The m vectors are obtained at various stages of the flight profile,
and it would be hoped that differences in the structure of the vectors reflect changes in the stress and
workload situation.
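For a longer section of speech, one possible parameter vector combines the mean of a fundamental frequency contour with moments about that mean. The sketch below is only an illustration of the idea; the choice of three components (mean, standard deviation, skewness), the function name and the contour values are all hypothetical.

```python
import math

def profile_features(contour):
    """Parameter vector [mean, standard deviation, skewness] of a frequency contour."""
    n = len(contour)
    mean = sum(contour) / n
    m2 = sum((v - mean) ** 2 for v in contour) / n   # second moment about the mean
    m3 = sum((v - mean) ** 3 for v in contour) / n   # third moment about the mean
    std = math.sqrt(m2)
    skew = m3 / std ** 3 if std > 0 else 0.0
    return [mean, std, skew]

# Hypothetical fundamental frequency estimates (Hz) from successive analysis intervals.
contour = [118.0, 121.0, 125.0, 131.0, 127.0, 122.0, 119.0, 117.0]
x_n = profile_features(contour)
print([round(v, 2) for v in x_n])
```

One such vector per speech sample, collected at different stages of the flight profile, gives the data set of m vectors referred to above.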

Such data require recourse to multivariate statistical methods. Principal component and factor
analyses (Ref 33) are obvious candidates. Such methods reduce the dimensionality of the vectors and, as a
consequence, can be used to demonstrate possible groupings in the original parameters. If consistent
changes can be produced in speech parameters, then well defined stressed and unstressed situations will
resolve into two distinct groups in the vector space. Of greater importance, however, is the fact that
these methods form the basis of techniques such as linear discriminant analysis which effectively optimize
the ability to distinguish between different groups of parameter vectors. These methods have proved useful
in speaker identification experiments, using parameters extracted from long sections of speech (Ref 34),
and are felt to be useful in the current application. For example, the data presented in Fig 5 form the
basis of a discriminant analysis using just two parameters; the inclusion of further parameters which may
provide better separation of the centroids is desirable.

Mathematically, discriminant analysis is a powerful tool, but well defined "training" data sets are
required. In the current application this implies that for each subject considered, it is necessary to
obtain sections of speech in both stressed and unstressed situations. If such data can define distinct
groups then it is possible to assign an arbitrary sample of speech to one of the two groups. The
reliable collection of training data constitutes a major problem, particularly in the stressed situation,
which is not easily definable. In this respect, the availability of other physiological data is of some
importance, at least during training procedures. Thus for the data presented in Fig 5, the
stressed-unstressed decision was based partly on a knowledge of the workload patterns in the flight profiles, but
mainly on the measured heart rate patterns during the flight. A simple correlation analysis between
physiological data and speech data is also proving valuable in establishing which speech parameters contain
useful information.
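A two parameter, two group discriminant of the kind discussed above can be sketched as follows. This is a textbook Fisher linear discriminant with a midpoint threshold, not the analysis actually applied to the Fig 5 data; the function names and the training values (percentage variations in fundamental and first formant frequency) are invented for illustration.

```python
def centroid(points):
    """Mean vector of a group of two-parameter observations."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def pooled_scatter(groups):
    """Within-group scatter matrix summed over all groups."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for points in groups:
        c = centroid(points)
        for p in points:
            d = [p[0] - c[0], p[1] - c[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    return s

def fisher_discriminant(unstressed, stressed):
    """Weights w = S^-1 (c_s - c_u) and a threshold midway between the centroids."""
    cu, cs = centroid(unstressed), centroid(stressed)
    s = pooled_scatter([unstressed, stressed])
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [cs[0] - cu[0], cs[1] - cu[1]]
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det]
    threshold = sum(w[i] * (cu[i] + cs[i]) / 2.0 for i in range(2))
    return w, threshold

def classify(x, w, threshold):
    """Assign an arbitrary two-parameter speech sample to one of the two groups."""
    return "stressed" if w[0] * x[0] + w[1] * x[1] > threshold else "unstressed"

# Hypothetical training data: (f0 variation %, first formant variation %).
unstressed = [(-2.0, -1.0), (1.0, 0.5), (0.5, -1.5), (-1.0, 2.0), (1.5, 0.0)]
stressed = [(-4.0, 5.0), (-2.5, 7.0), (-3.0, 4.0), (-5.0, 6.5), (-1.5, 5.5)]

w, threshold = fisher_discriminant(unstressed, stressed)
print(classify((-3.0, 6.0), w, threshold), classify((0.5, 0.5), w, threshold))
```

The invented groups here are well separated; with overlapping clusters such as those in Fig 5, many samples would fall close to the threshold and be assigned with little certainty.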

CONCLUSIONS  AND  RECOMMENDATIONS 

The salient features of this review lead to recommendations which may constitute a methodology for
the investigation of high workload using speech patterns.

1. The aim should be to reduce the dimensionality of speech and provide a succinct description of the
data. This must be done in a way which preserves the possible stress or high workload information. Also, a
statistically robust method must be found to classify the reduced speech data into at least two groups of
stressed-unstressed parameters.

2. The nature of the data requires some attention. It is suggested that an attempt to extract relatively
simple speech parameters from many flights across several subjects is the most viable approach. In the
long term, a complex and sophisticated analysis on a limited set of data may not be profitable.

3. For a given flight profile it is desirable to have a knowledge of the likely stress or high workload
patterns, together with some indication of their rates of change. This information will influence the type
and length of speech samples which may be used. Thus for rapidly changing high workload profiles, only
single words or phonemes may be used, but longer sections of speech may be employed for slowly varying
stress patterns.

4. Our experience has suggested that no matter what type of speech sample is chosen, a cepstrum analysis
technique offers a realistic compromise between the degree of processing power required and the amount of
information preserved in the reduced data. At this stage, more involved processing methods, while
possibly being more powerful, are not considered to be worthwhile, and indeed would not offer such
reliable results given the poor quality of the original speech samples.

5. Cepstrum analysis can offer short time smoothed spectrum and formant information together with
fundamental frequency information. Pitch and formant profiles over longer periods of time may be easily
constructed. Previous research in this area has suggested that such measures are of considerable value.

6. Whatever statistical methods are employed, they must be capable of assigning arbitrary speech data to
some point on a stressed-unstressed scale. Our experience has suggested that initially, only a binary
quantisation of the scale may be possible. In any event, multivariate methods are necessary and linear
discriminant analysis looks very promising.

7.  It  is  unlikely  that  an  absolute  estimate  of  stress  or  workload  can  be  obtained  from  a  single  speech 
sample  in  isolation. 

8. Simple correlation analysis of speech parameters with physiological data appears to be a realistic
means of establishing which parameters will be of use in the long term.

9. Finally, voice micro-tremor has found limited uses in commercial devices, but a rigorous study of its
possible usage in the current application should not be neglected.

INTRODUCTION

There has been an ever-increasing concern within the Federal Aviation Agency for the possible adverse
effects of stress inherent in the character of the work of Air Traffic Control Specialists (ATCS).
Someone has characterized the job of the pilot as involving "hours of routine monotony interspersed by moments
of sheer terror." Perhaps this is no less true of the job of the controller, whose basic task is to
maintain an orderly flow of air traffic, maintain the safe separation of enroute and converging terminal
traffic, and to assist the pilot, often under adverse flying conditions.

As pointed out by Dougherty, Trites, and Dille, who compared health information between ATCS and
non-ATCS personnel, those who are engaged in this particular occupation, as well as external observers of
the job situation, feel that there is inherent stress involved in the work which may have adverse effects
(1). These effects undoubtedly involve internal stress factors such as fatigue, aging, and job experience,
as well as external factors of potential aircraft conflict, workload, critical incidents, and other
aerospace events. The major concern of human engineering has been to develop command and control systems
wherein better displays and more functional controls would enable the controller to better perform his
demanding task and ultimately render it less stressful. Basic to this concern has been an attempt to
define the controller's task and to identify certain aerospace events, such as number of aircraft, aircraft
speed, control sector size, etc., which may be crucial factors in the controller's job performance (2, 3).
However, such studies have served only to point out that the real need in evaluating the efficiency of
control systems, or of the operator himself, is the establishment of relevant criterion measures. Studies
in this area, to date, have demonstrated that simple measures of various aerospace events which comprise
the controller's workload do not fully relate to the complex stresses that are experienced in the job
performance.

Since external job-related measures do not offer satisfactory criteria, we have turned to internal
operator-related measures in an effort to determine their usefulness in evaluating the stressors inherent
in the work of the ATCS. Therefore, this study was designed to explore the possibility that certain
physiological measures could be related to some aspects of the controller's task, namely, workload defined
in terms of number of aircraft (traffic density), and the occurrence of aircraft conflicts.

PROCEDURE 

Stimulus Materials: Since the research goal was to determine if selected physiological variables were
related to controller workload, the stimulus materials were selected to provide two extremes of work level.
This was done by simulating a PPI (Plan Position Indicator) display of an enroute sector by means of
special films using the "CODE" (Controller Decision Evaluation) technique developed by one of the authors
(4). One film presented a traffic pattern of low density, i.e., few aircraft and few conflicts. The other
film presented a high density traffic pattern, i.e., many aircraft, many more conflicts, and higher
aircraft speeds. An aircraft conflict is defined in terms of aircraft in flight that approach each other
in such a manner as to violate established separation criteria. This is a potential collision situation.

The  problems  were  approximately  40  minutes  in  duration.  The  low  density  sample  had  an  average  of  6.6 
aircraft  under  control  at  a  given  time.  The  average  aircraft  speed  was  470  knots  with  a  range  of  380 
knots  to  550  knots.  The  number  of  conflictions  occurring  during  the  problem  was  four. 

The high density sample had an average of 19.4 aircraft under control at a given time. The average
aircraft speed was 476 knots with a range of 346 knots to 566 knots. The number of conflictions occurring
during the problem was 16.

Subjects:  Ten  subjects,  all  Air  Traffic  Control  Specialists,  were  selected.  Their  chronological  age 
range  was  from  29  to  46  years  with  a  mean  of  35.7.  Their  experience  as  controllers  ranged  from  7  to  18 
years  with  a  mean  of  11.5. 

Instructions: In order to standardize the subject's approach to the experimental task, the following
instructions were read to each man individually:

You are being asked today to take part in an experiment concerning controller
workload and certain related physical changes. We have developed a film showing
traffic on an enroute sector. The picture will change every six seconds to give
an approximation of a six second sweep on a PPI scope. Some of the aircraft
develop conflictions in the film. Your task will be to discern when conflictions
are developing and to indicate this fact by pressing the button on your left and
noting the identities of the aircraft in confliction on the sheet before you.

The number of aircraft you will be responsible for may be unusually high and be
going unusually fast, but do what you can. There are two sectors displayed. You
are responsible for the conflict detection task for both sectors. Your separation
standard is 5 miles and 1,000 feet for the total area. YOUR PERFORMANCE IS SCORED
ON HOW ACCURATELY YOU PERFORM. Avoid reporting too soon as this may be only a
potential conflict. Avoid reporting too late; preventive action could not then be
taken. Both of these factors will be considered errors in determining your score.

The  records  of  your  performance  are  for  research  purposes  only  and  will  not  be 
divulged  to  anyone  for  any  other  purpose. 

Note that you are asked for two things: (a) to press the button on your left at
the time at which you would normally instruct one of the pilots to take preventive
action, (b) to note the identities of the aircraft involved.

All aircraft in the system are identified by alphanumerics. Information on
altitude and changes in altitude is also given. Frame 6 (sample display) shows
how this information will be given throughout this film. The scale of the map
is in the upper left.

Are  there  any  questions? 

Test Schedule: All subjects were scheduled for one session during which they monitored the display and
performed the required task of conflict identification for both the high and low density traffic patterns.
The order of presentation of the high and low density stimuli was counterbalanced across subjects
to rule out order effects. Each traffic pattern was monitored for 40 minutes. Thus, the subjects were
instructed, instrumented, calibrated, and then monitored either the high or low film for 40 minutes. Then
they had a 10-minute break during which time the film was changed. Then followed another 40-minute session
using the alternate film.
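
The counterbalancing described above can be illustrated with a minimal sketch. It assumes a simple alternation of the two film orders across the ten subjects; the report does not say how subjects were actually assigned to orders, and the function name and condition labels are hypothetical.

```python
def counterbalanced_orders(n_subjects, conditions=("low", "high")):
    """Alternate the two presentation orders across subjects (assumed scheme)."""
    first, second = conditions
    orders = []
    for i in range(n_subjects):
        # Even-indexed subjects see `first` then `second`; odd-indexed the reverse.
        orders.append((first, second) if i % 2 == 0 else (second, first))
    return orders


# Ten subjects, as in the study; half start with each film.
orders = counterbalanced_orders(10)
```

With an even number of subjects, each film is seen first by exactly half of the group, so a practice or fatigue effect tied to session position is spread equally over both conditions.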

Physiological Measurements: Two physiologic measures were selected as dependent variables: Heart Rate
and Psychogalvanic Skin Response, also called the Galvanic Skin Response (GSR). Heart rate was recorded
via two electrodes attached at conventional locations on the chest approximately 6.5 cm above and below
the left nipple on the mid-clavicular line.

GSR was recorded from electrodes attached on each hand in the fleshy area commonly called the "heel."
Anatomically, this area can be described as lying over the fourth metacarpal bone and about 2 cm. to
the ulnar side of the palmar aponeurosis. Prior experience has demonstrated that excellent GSR responses
can be obtained at this site without the movement artifacts usually associated with a central palmar
location. In fact, our subjects were able to write without producing noticeable artifacts.

The  electrodes  were  of  local  manufacture  using  silver-silver  chloride  material  lk  cm.  in  diameter 
and  mounted  in  a  plastic  cup  measuring  approximately  3  cm.  in  diameter.  The  electrolyte  used  was  EKG-SOL 
(Beck-Lee  Corporation).  The  sites  were  prepared  by  sponging  with  acetone  and  the  electrodes,  filled  with 
the  electrolyte,  were  attached  by  Eastman  910  adhesive. 

Leads  from  this  sensor  were  fed  into  appropriate  couplers  of  an  E.  and  M.  Physiograph.  Both  heart 
rate  and  GSR  were  condenser-coupled.  Read-out  was  continuous  at  120  mm.  per  minute. 

Subject calibration for GSR determinations was done by the "sniff" method. Prior to starting the
experimental run, the subjects were requested to sniff (a rapid inhalation through the nose) at *s minute
intervals, and the GSR amplifier was adjusted to yield a 20 mm. excursion. When the subject's "sniff"
response stabilized at this level the session began. In order to maintain the GSR at a "standard" level
throughout the experimental session, the subjects were requested to give a sniff response every minute.

By  this  method,  each  subject’s  GSR  could  be  calibrated  and  maintained  throughout  the  experiment. 

Data Reduction: Heart rate was scored on a minute-to-minute basis and an average determined for each
experimental session. GSR was evaluated in two ways. First, total GSR frequency was determined in terms
of a "standard unit." This standard unit was arbitrarily selected as a 5 x 5 mm. square area. All GSR
responses which were less than this size area were ignored. The records were then hand scored in terms of
this standard unit. In order to check on the accuracy of the hand scoring, GSR's from each record were
randomly selected and the area scored by means of a planimeter (Keuffel and Esser 4211). By this means
the hand scoring was determined to be in error less than one percent. Hand scoring for area is a laborious
procedure but, in the absence of an electronic integrator, reasonable accuracy can be obtained, although
we are not advocating the procedure.

Statistical Analyses: The means and medians for the high and low density situations were computed for
all four measures. Both parametric and non-parametric tests for the statistical significance of differences
were done for all four measures. The matched pairs 't' test was the parametric test used. The arcsin
transformation was used before applying the 't' test to the percentages of confliction detection. The sign
test was the non-parametric test used. One tailed tests were used in all cases.
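
The tests named above can be sketched in a few lines of Python. This is an illustration only, not the authors' computation: the function names are hypothetical, the arcsin transformation is shown in one common form (2·arcsin√p), which the report does not specify, and the example data are the GSR area values printed in Table IV.

```python
import math


def sign_test_one_tailed(low, high):
    """One-tailed sign test: P(at least this many increases) under p = 0.5.

    Tied pairs are dropped, as is conventional for the sign test.
    """
    diffs = [h - l for l, h in zip(low, high) if h != l]
    n = len(diffs)
    k = sum(1 for d in diffs if d > 0)
    return sum(math.comb(n, i) for i in range(k, n + 1)) / 2 ** n


def paired_t(low, high):
    """Matched-pairs t statistic (df = n - 1) for the high-minus-low differences."""
    diffs = [h - l for l, h in zip(low, high)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


def arcsin_transform(percent):
    """Arcsin (angular) transformation of a percentage; assumed form 2*asin(sqrt(p))."""
    return 2.0 * math.asin(math.sqrt(percent / 100.0))


# GSR area per subject, low and high density films (values from Table IV).
gsr_area_low = [198, 134, 119, 88, 146, 221, 150, 197, 164, 196]
gsr_area_high = [433, 279, 207, 296, 199, 478, 206, 234, 272, 324]

p_sign = sign_test_one_tailed(gsr_area_low, gsr_area_high)
t_stat = paired_t(gsr_area_low, gsr_area_high)
```

All ten subjects show a larger GSR area under high density, so the one-tailed sign test probability is 1/1024, in line with the value of .001 reported for that measure. The t statistic would still have to be referred to a t table with 9 degrees of freedom to obtain a probability.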

RESULTS 

Establishment of Difference Between Traffic Samples: The confliction detection performance is shown in
Table I. That the two traffic samples differed markedly is indicated by the significant difference in
confliction detection performance. This establishes that we were in fact dealing with traffic samples
which were markedly different in difficulty.

The fact that some confliction detections were missed in the heavier traffic sample shows that it
was probably unrealistically difficult. This is in accordance with the rationale stated earlier for this
pilot study, that of using two markedly different conditions to examine the physiological measures. In
addition, the fact that there were considerable overwrites among the alphanumerics on the film undoubtedly
increased the difficulty of detections markedly. In general, then, the reader is cautioned not to regard
these percentages of conflict detections as operationally valid but rather to keep in mind the rationale
for this test as an exploratory study.


TABLE I

CONFLICTION DETECTION IN LOW TRAFFIC DENSITY AND
HIGH TRAFFIC DENSITY SITUATION FILMS

Subject    Low Density (%)    High Density (%)

   1            100                 56
   2            100                 75
   3            100                 62
   4            100                 44
   5            100                 75
   6             75                 50
   7            100                 69
   8            100                 69
   9             75                 69
  10            100                 69

Mean             95                 64
Median           95                 69

Sign test p = .001
t (transformed data) p = .0005
t (untransformed data) p = .0005

Evaluation of Physiological Measures: The major purpose of the study was to see whether certain
physiological measures discriminated between two traffic samples of different difficulty, i.e., whether
they would reflect workload.

Tables II, III, and IV present the data from three measures: heart rate, GSR frequency, and GSR
area, respectively. It is clear from the tables that the best discriminator is GSR area; that the GSR
frequency measure is a moderately definite discriminator; and that heart rate is least effective,
although discriminating.


TABLE II

HEART RATE DURING LOW TRAFFIC DENSITY AND HIGH
TRAFFIC DENSITY SITUATION FILMS

Subject    Low Density    High Density

   1          96.68           99.35
   2          72.48           71.39
   3          90.88          119.37
   4          90.05           85.64
   5          99.40          105.23
   6         105.88           96.58
   7          63.49           75.51
   8          79.47           79.86
   9          65.03           67.66
  10          88.09          102.99

Mean          85.14           89.56
Median        89.07           91.11

Sign test p = .055
t test p = .10 - .05


TABLE III

GSR FREQUENCY DURING LOW TRAFFIC DENSITY AND HIGH
TRAFFIC DENSITY SITUATION FILMS

Subject    Low Density    High Density

   1           107             110
   2            64              72
   3            36              75
   4            46             112
   5            51              99
   6            77              76
   7            59              82
   8           109              92
   9            51              97
  10            71             128

Mean          67.1            94.3
Median        61.5            94.5

Sign test p = .055
t test p = .01 - .005


TABLE IV

GSR AREA DURING LOW TRAFFIC DENSITY AND HIGH
TRAFFIC DENSITY SITUATION FILMS

Subject    Low Density    High Density

   1           198             433
   2           134             279
   3           119             207
   4            88             296
   5           146             199
   6           221             478
   7           150             206
   8           197             234
   9           164             272
  10           196             324

Mean         161.3           292.8
Median       157.0           275.5

Sign test p = .001
t test p = .0005

This pilot study, then, within its limitations, has accomplished its purpose of determining whether
extensive examination of this type of measure was warranted, since it has shown that psychophysiological
measurements could at least discriminate the human reaction to traffic situations which were known to
differ widely in difficulty.

DISCUSSION  AND  CONCLUSIONS 

The reader should be aware, on the one hand, that the traffic situations portrayed here were different
in difficulty in the extreme and that these physiological stress measures may not be as successful in
discriminating the human effort differences associated with more normal sector-to-sector or hour-to-hour
variations in traffic. On the other hand, if further studies confirm the results obtained here, a tool
for systems research and development of considerable importance has been found. As only one example of
this utility, this methodology could be useful to verify, refine, and improve the recent formulation by
Arad of a mathematical index of the complexity of airspace events (2, 3). Another obvious use is as a
criterion for new systems which may have, as one of their values, a reduction in controller effort,
fatigue, and stress.

The study has had another outcome, important in the technology of psychophysiological measurement.
As previously noted, GSR changes in the subjects were more detectable using variations in measured
amplitude area, as compared to frequency of GSR changes. While our methods of evaluating GSR area were
laborious due to equipment limitations, the availability of integrating methods for automatically yielding
measures of amplitude change should yield important data in evaluating the GSR relative to workload and
other stress studies. In this, our work appears to closely parallel a Russian study (5) where
Kozarovitskii reports: "In inexperienced airline dispatchers the galvanic skin responses deviated from
normal due to fatigue at the end of a working day or under tension. The character of the tracings showed
diminishing skin resistance, either an increase or a decrease in amplitude in different individuals, and
a decrease in frequency. Distraction of attention tended to lower the skin resistance which depends
(on) stimulating and suppressing processes." He also points out that, "Experienced subjects showed fewer
sharp variations, less fluctuation in amplitude, a faster fading of the response, and less reaction to
the extraneous stimuli."

While this study was of limited scope, it seems that the study of physiological parameters of ATCS
workload may yield important criterion measures of external factors of aircraft conflict, task overload,
critical incidents, and other aerospace events. Future studies will explore other physiological
variables for use as criterion measures, as well as tools for the evaluation of internal stress factors
in relation to fatigue, aging, and traffic control experience.

This is a simulation study to examine the relationships between field air traffic controller
performance indices and system performance measures. The study encompassed performance criteria developed
within two distinct environments, the controller's home facility where he controlled live traffic, and
a specially designed microsystem or "one-man ATC system" with simulated traffic. This microsystem
simulation was done at the National Aviation Facilities Experimental Center. Thus, the experiment
represented a comparative examination of several quantitative measures of system functioning derived from
air traffic control simulation and an investigation of these measures as indices for the objective
evaluation of the individual air traffic controller.

The initial impetus for this study arose from a concern over the relationship of age and experience
to controller proficiency. To a large number of controllers in the field, there appeared a definite
trend for older men to be unable to adequately handle the new and complex demands related to the increasing
pace in the air traffic control system. While this question of age versus proficiency formed the initial
experiment, the basic interest lies in the necessity of the FAA to maintain a highly competent workforce
of air traffic control specialists. The central assumption is that, in order to evaluate the effects of
age or any other variable upon air traffic controller performance, we must first develop and validate
an objective and reliable criterion of performance that has known relationships with controller task
functions. Until the establishment of such criteria, any question such as the matter of age versus
proficiency could only be appraised in terms of indirect or anecdotal measures. Another aspect of this
study was that simulation, which had been designed and utilized to study procedural and system differences,
had never been employed for the assessment of performance differences associated with the individual
controller. Through the use of simulation in this manner a technique could be developed which would permit
the evaluation of each controller operating his test sector as a "micro ATC system" and utilize the system
to measure varying system load levels and related controller behavior/efficiency.

Thirty-six (36) journeymen enroute air traffic controllers served as subjects, having been chosen as
a randomized stratified sample of the personnel from four enroute air traffic control centers. The
controllers were brought to the NAFEC center for one week in groups of four to receive an orientation
to the simulation task, including the fictitious geographical area which was to be simulated and the
traffic control local procedural rules which were to be in effect in this sector. Each subject performed
traffic control during six one-hour runs, with two runs at each of three traffic densities, an experimental
protocol designed to produce scores which would measure individual rather than team performance. In
addition, each subject was tested on an abbreviated simulation method called CODE, which stands for
Controller Decision Evaluation. During the main simulation, various performance measures derived from
counts or timing of events, for example, the number of aircraft delayed, the delay time, and so forth,
were taken. In addition to these system performance measures, stress sensitive measurements of
physiological functions were obtained under this dynamic simulation. In addition to the physiological
variables of heart rate and galvanic skin response (GSR) measurements, a number of psychological measures
were obtained through the use of the 16 PF Test.

The conclusions based upon the results of this project are as follows:

1. The current chronological age of the 36 subjects, ranging in age from 31 to 45 years, possessed
weak, negative relationships with indices of controller proficiency in both field ratings and
simulation performance measurements. Age alone, within the range studied, is not a very good
performance predictor.

2. Controller age, modified in various ways by experience, does have some effect on performance.
This age effect operates in the direction of greater caution and safety, with a tendency toward
delay of traffic. However, there are wide individual differences within age groups and
considerable overlap in proficiency indices between age groups.

3. Current age and age at entrance on duty were highly correlated in this journeyman level group.
At the journeyman level, differences apparently due to current age may in fact be due to
age at entrance on duty.

4. The personality scale scores based on the 16 PF scales have an unusually large number of
statistically significant relationships with controller performance. This suggests that the
controller's task, which requires sustained performance under complex circumstances, makes such
stressful demands as to involve his total personality as well as his skills. The use of these
scores as predictors of controller efficiency is not validated. The 16 PF tests reflected that
superior controller performance might be linked with the following characteristics: freedom
from depression, lack of timidity, and being socially realistic and relaxed with a relative
absence of tenseness or anxiety.


*  Abstracted  with  permission  of  the  senior  author  from  final  report  No.  NA  69-40,  Federal  Aviation 
Administration,  September  1969,  by  Richard  E.  McKenzie,  Ph.D. 

5. In terms of physiological measurements, the controller's heart rate rises in a manner
corresponding to increases in the level of traffic, and the heart rate levels observed are probably
indicative of the stress inducing nature of his task. Unfortunately, the GSR measures did not
correspond to the previously reported exploratory study; however, this may be due to the method
of evaluation or to the method of GSR collection, which was by foot electrodes as opposed
to palmar surface electrodes.

6. Simulation system performance measures are reliable and sufficiently precise to measure
individual differences in controller proficiency.

7. Simulation measures and field indices of controller performance possess sufficient overlap to
establish a meaningful correspondence between the simulation test environment and the live
traffic environment. Simulation technology, then, is capable of providing reliable and
objective measurements of controller proficiency.

8. Part measures of the controller's task using the CODE technique appear to be a good objective
measure of certain fundamental controller abilities which warrants further development.


9. Results from factor analysis indicate that nine specific system performance criteria are
sufficient to describe system functioning over the range of traffic studied.

10. An index called Rdv, which represents the quantification of a trade-off function between volume
handling capacity and the occurrence of delays to aircraft, offers promise as a measure of
system load. This index appears suitable for utilization in both the live and simulation
system environments for assessing workload differences associated with various sector
configurations, staffing patterns, and different geographical areas. The index Rdv is mathematically
defined as the correlation between delays and volume in terms of number of aircraft handled at
a given traffic level.
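
As an illustration of the definition above, the following sketch computes a correlation between delays and volume. It assumes the Pearson product-moment coefficient, which the abstract does not state explicitly; the function name and the per-run figures are hypothetical and are not data from the report.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical runs at one traffic level: aircraft handled and aircraft delayed.
volume = [20, 24, 28, 32, 36, 40]
delays = [1, 2, 2, 4, 5, 7]

r_dv = pearson_r(delays, volume)
```

A value near +1 would mean that delays rise almost in step with the number of aircraft handled; how the report maps such values onto system load is not described in the abstract.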

Throughout the history of attempts to evaluate individual and system performance factors, simulation
techniques and psychological tests have not always proven effective. In this study we see that a
simulation system can yield measures of controller proficiency and that at least some psychological test
scores can depict superior controller characteristics. Since scores on the 16 PF scales were correlated
with both the simulation system performance measures and the field rating measures, further exploration
of this test as a predictor of successful controller qualities or as a possible method of evaluating
decrements in the performance of career air traffic controllers seems indicated.



WORKLOAD  AND  STRESS  IN  AIR  TRAFFIC  CONTROLLERS 

Abstract


Data collected at 14 air traffic control facilities regarding air traffic controller (ATCS) workload
and urinary stress indicator hormone (SIH) excretion are reviewed. The data show a significant relationship
between objective workload measures (radio transmission time and traffic counts) and indexes of
catecholamine excretion. Mean epinephrine excretion by ATCSs at six air traffic control towers ranging from
very low to very high traffic density was significantly (R = 0.96) related to annual traffic counts at
those towers. The sympatho-adrenomedullary axis that prepares the organism for "fight or flight" described
by W. B. Cannon in 1929 apparently is applicable to ATCSs. The question of underload, optimum load, and
overload is discussed.

I. Introduction. The workload experienced by air traffic controllers (ATCSs) is difficult to define.
One may consider imposed load objectively in terms of numbers of aircraft handled, but the subjective load
perceived by the controller may be a greatly different quantity.

Many factors may operate as workload modifiers, making the work either easier or more difficult:
(1) Type of traffic handled. One aircraft in distress may cause more "work" than all the other traffic
being handled. (2) Weather. Controllers' perceived workload always increases when pilots cannot maintain
visual separation in instrument meteorological conditions. (3) Equipment outages and malfunctions causing
reversion to manual methods of control. (4) Disruption of circadian rhythms caused by rotating shifts,
and (5) General physical and emotional condition resulting from a variety of off-duty activities and
on-duty problems with management or peers.

It is perceived workload that gives rise to the poorly-defined entity known as stress. Excessive
stress  has  generally  been  assumed  to  be  a  component  of  air  traffic  control  work  and  has  been  legally 
recognized  as  such  in  Public  Law  92-297  which  provides  full  retirement  for  controllers  over  50  years  of 
age  after  20  years  of  work  controlling  air  traffic. 

Estimates of stress in ATCSs are rendered difficult because of the interaction of off-duty and
on-duty experiences. The ATCS undoubtedly brings off-duty problems to work with him and, just as certainly,
takes home with him concerns connected with the work place. Thus, a complete representation of stress in
ATCSs must integrate all aspects of the ATCS's life.

There has been a strong tendency in the popular press to describe stress in ATCSs in terms of
conditions at "hot spot" facilities such as O'Hare and Atlanta Air Traffic Control Towers (ATCT).
Generalizations from these descriptions give a skewed idea about stress in the entire population of ATCSs,
most of whom work in facilities with far fewer operations.

For the last .0 years this laboratory has carried out studies aimed at providing a general description
of  stress  in  ATCSs.  These  studies  have  encompassed  several  variables  including  numbers  of  air  traffic 
operations,  shift  rotation  effects,  automation,  different  kinds  of  air  traffic  control  (ATC)  work,  and 
geographical  distribution.  This  report  represents  an  attempt  to  provide  a  general  concept  of  stress  and 
workload  in  ATCSs. 

II. Methods. Estimates of stress were derived primarily from urine biochemical analysis for 17-ketogenic
steroids (17-KGS), epinephrine (E), and norepinephrine (NE). In most cases values for these stress
indicator hormones (SIH) are expressed as creatinine (CR)-based ratios (wt SIH/100 mg CR). Urine analysis
for SIH was carried out as previously described (1). Urine collected at field facilities was frozen at
the work site; when a sufficient number of specimens had accumulated, they were shipped to the Civil
Aeromedical Institute (CAMI) by air freight. Upon receipt the specimens were placed in a freezer where they
were kept until analyzed. Specimens were in transit for 3-5 h and detectable thawing did not occur.
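
The creatinine-based ratio mentioned above (weight of SIH per 100 mg of creatinine) amounts to a simple normalization. The sketch below is illustrative only; the function name and sample values are hypothetical, and the weight unit of the hormone depends on the assay, which the text does not give here.

```python
def sih_per_100mg_creatinine(sih_weight, creatinine_mg):
    """Express an SIH amount as weight of SIH per 100 mg of creatinine.

    `sih_weight` is the hormone found in the specimen (unit depends on the assay);
    `creatinine_mg` is the creatinine found in the same specimen, in milligrams.
    """
    if creatinine_mg <= 0:
        raise ValueError("creatinine must be positive")
    return 100.0 * sih_weight / creatinine_mg


# Hypothetical specimen: 1.2 units of hormone and 150 mg of creatinine.
ratio = sih_per_100mg_creatinine(sih_weight=1.2, creatinine_mg=150.0)
```

Dividing by creatinine removes much of the dependence on urine volume and collection time, which is why specimens covering rest and work periods of different lengths can be compared.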

Subjects were all volunteer male air traffic control specialists. They were instructed to void and
discard urine just prior to retiring the night before a workday. They were then to collect all urine
voided until they arose; normally there was only one voiding and that one upon arising. They were then
instructed not to collect urine until they arrived at work. At work they were told again to void and
discard and to collect in one container (or two, if that one became full) all urine subsequently voided
during the workday. ATCSs then repeated this collection regimen for various periods of time depending
on the facility being studied. In some studies urine was collected for a whole 5-day workweek; in others,
urine was collected for 2 days only because of changeable shift patterns. Each 24-h rest-work period was
represented by two specimens.

Urine was collected in cuboidal, plastic 1-quart receptacles containing an excess of dry boric acid
as a preservative. When ATCSs delivered the containers to the technical crew the containers were labeled,
logged, and frozen.

Workloads were estimated in two ways. One involved the recording of all radio transmissions received
by and coming from the subject ATCS. From these recordings total radio transmission time (RTT) was derived
by use of voice-actuated relays and digital counters. Workload was also derived from traffic counts.

In order to amalgamate a large volume of biochemical data, a stress index (Cs) was formulated. The
details of the index have been published (2). Briefly, the index is based on the idea that the product
of the resting and working values for each SIH gives a more realistic view of stress than does the
excretion increment (or decrement) from rest to work. However, because the SIHs appear in such unequal
quantities in the urine, each individual mean value (rest and work) is adjusted by dividing it by
a grand mean derived for that SIH from all the measurements made on ATCSs in all past studies in this
laboratory. This adjustment causes all SIHs to assume equal importance in the calculation of Cs. Cs is
the average of indexes calculated for each of the SIHs: cst (17-KGS), ce (E), and cne (NE). This index
allows different controllers and facilities to be readily compared.
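
The brief description above can be turned into a minimal sketch. It is an interpretation, not the published formula (2): it assumes a single grand mean per hormone applied to both the resting and the working value, and all names and numbers are hypothetical.

```python
def sih_index(rest_mean, work_mean, grand_mean):
    """Index for one hormone: product of the grand-mean-adjusted rest and work values.

    Assumption: the same grand mean adjusts both values; the published index (2)
    may differ in detail.
    """
    return (rest_mean / grand_mean) * (work_mean / grand_mean)


def composite_stress_index(c_st, c_e, c_ne):
    """Cs: the average of the three single-hormone indexes."""
    return (c_st + c_e + c_ne) / 3.0


# Hypothetical mean excretion values (rest, work) and grand means for each hormone.
c_st = sih_index(rest_mean=1.0, work_mean=1.5, grand_mean=1.0)  # 17-KGS
c_e = sih_index(rest_mean=0.5, work_mean=1.0, grand_mean=0.5)   # epinephrine
c_ne = sih_index(rest_mean=2.0, work_mean=2.0, grand_mean=2.0)  # norepinephrine

c_s = composite_stress_index(c_st, c_e, c_ne)
```

Because each value is divided by the grand mean for its own hormone, a controller whose excretion equals the laboratory-wide average at rest and at work gets an index of 1 for that hormone, whatever the absolute quantities involved.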

The indexes for each of the SIHs can be presented diagrammatically to show composite stress and the
relative contributions of each SIH thereto. The diagram is based on the theorem that the sum of the
lengths of internal lines emanating from a common point and perpendicular to the sides of an equilateral
triangle is equal to the altitude of the triangle (3,4). The values for cst, ce, and cne can be
represented as lines originating at a common point and diverging at angles of 120°, the lengths of which
are proportional to the values of cst, ce, and cne. Lines drawn perpendicular to the free ends of the
diverging lines form an equilateral triangle, the area of which is proportional to Cs, the average of
cst, ce, and cne.
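
The geometry of this diagram can be checked with a short sketch. It relies only on the theorem cited above: for a point inside an equilateral triangle, the three perpendicular distances to the sides add up to the altitude. The function name and the example index values are hypothetical.

```python
import math


def stress_triangle(c_st, c_e, c_ne):
    """Dimensions of the equilateral triangle enclosing three index lines at 120 degrees.

    By the cited theorem, the altitude equals the sum of the three line lengths,
    that is, three times Cs.
    """
    altitude = c_st + c_e + c_ne
    side = 2.0 * altitude / math.sqrt(3.0)
    area = 0.5 * side * altitude
    return altitude, side, area


# Hypothetical single-hormone indexes, all equal to 1 (the laboratory-wide average).
altitude, side, area = stress_triangle(1.0, 1.0, 1.0)
```

The relative lengths of the three lines fix where the common point sits inside the triangle, which is how the diagram shows the contribution of each hormone, while the overall size of the triangle grows with the composite index.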


III. Results. Field Experiments. Table 1 shows the correlation between Cs, cst, ce, cne, and RTT at
Opa Locka (OPF) Air Traffic Control Tower (ATCT) located on a very busy general aviation airport in
Greater Miami, Florida. It is apparent that RTT is significantly related to ce and, with less
significance, to cne. RTT is not significantly related to cst.

Figures 1-4 show graphically the relationship between stress indexes and RTT at OPF.

Between 1972 and 1974, Los Angeles (LAX) and Bay Area (Oakland (OAK)) Terminal Radar Approach
Control (TRACON) facilities were given automated radar terminal system equipment (ARTS-III). This
equipment displays aircraft identification and altitude on the radar cathode ray tube, and thus contributes
greatly to safety. This equipment was also expected to reduce significantly the workload of radar
controllers. There were other changes associated with the new equipment, also. The TRACONs were moved
from their towers to separate buildings with adjacent parking lots; the dress code was relaxed; lounge
facilities and the general work environment in the control rooms were greatly improved.

Studies were carried out at these two TRACONs prior to (1972) and after (1974) installation of
ARTS-III. Table 2 shows changes in stress indexes for individual controllers before and after ARTS-III
installation. It is clear that there was a uniform drop in 17-KGS and an increase in catecholamine
excretion. The workload in terms of number of aircraft worked and number of radio contacts is shown in
Table 3.

The traffic count at LAX increased by 3 percent and at OAK by 4 percent from 1972 to 1974. The
number of radio contacts increased by 3 percent at LAX and by 1 percent at OAK, while composite stress
increased by 25 percent at LAX and 20 percent at OAK. This increase can be seen diagrammatically in
Figure 5. Thus, the disproportionate increase in stress, entirely due to elevated catecholamine
excretion, is not explained by the objective workload. The explanation most likely lies in work elements not
reflected in traffic counts, RTT, or number of radio contacts. The new TRACONs had been in use only
about 3 months at the times of the second studies. There were still equipment difficulties to be
worked out; outages were fairly frequent. Controllers liked the reduction in coordination with other
facilities and within the TRACONs; however, the consensus among the controllers was that ARTS-III had
not reduced and, in fact, had increased the total workload, primarily because of unfamiliarity with the
new equipment.
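
The percentages quoted above can be reproduced from the counts in Table 3 and the composite index values printed with the triangle diagrams for the two TRACONs. The short Python sketch below is only a check of that arithmetic; the helper name and the dictionary layout are illustrative choices.

```python
def percent_change(before, after):
    """Percentage change from the 1972 value to the 1974 value."""
    return 100.0 * (after - before) / before

# (1972, 1974) pairs: aircraft and contacts from Table 3, Cs from the triangle diagrams.
facilities = {
    "LAX": {"aircraft": (1803, 1860), "contacts": (13806, 14210), "Cs": (0.60, 0.75)},
    "OAK": {"aircraft": (1190, 1233), "contacts": (8712, 8827), "Cs": (0.60, 0.72)},
}

changes = {
    name: {measure: round(percent_change(*pair), 1) for measure, pair in data.items()}
    for name, data in facilities.items()
}

for name, result in changes.items():
    print(name, result)
```

The objective workload measures change by a few percent at most, while the composite index rises by roughly a fifth to a quarter, which is the disproportion the text refers to.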

Catecholamine Excretion and Traffic Count. Because of the demonstrated relationship between E
excretion and RTT, annual traffic counts for ATCTs where studies have been conducted were graphed
against mean E excretion for controllers at those towers. Such a graph is shown in Figure 6, where
annual traffic count is plotted against mean working and resting E excretion for ATCSs at ATCTs ranging
from low to high density. The relationship between traffic count and mean E excretion is significant
(R = 0.96). The working value for O'Hare ATCSs has been displaced to the left to reflect the fact that
ORD ATCT was effectively operated as two facilities with separate control positions for the north and
south sides of the airport; one side was customarily used for departures and the other side for arrivals.
The workload impinging on each ATCS was thus about half of the total airport traffic. When the data
point is moved to reflect this division of work, it falls near the line of best fit.

Laboratory Experiments. Because real-life stress arises from a mixture of stressors, the
physiological responses to those stressors are difficult to interpret. One cannot separate off-duty
experiences from work-related factors. Therefore, an attempt was made to expose paid experimental subjects to
"pure" stressors in the laboratory in order to delineate the specificity of the hormonal response,
should there be such.

The subjects (10 young men) were each exposed to a purely physical task with no competitive element
(treadmill, 3 miles per hour with no grade) and, on another date, to a purely competitive but nonphysical
task ("Pong," a video game based on ping-pong). One of the researchers acted as opponent for all subjects;
she was an expert and was rarely beaten. Order of presentation of the tasks was balanced; each task
was presented in 50-min episodes. In the 10 min following each episode, urine collections were made,
rest was allowed, and water was imbibed to replace the urinary loss. Urine was analyzed for 17-KGS, E,
and NE. Values are expressed as the total quantity of each SIH excreted during each 50-min episode.

The schedule in each instance was maintained for 3 h.

Prior to either experimental exposure, each subject rested for 50 min in the supine position on a
cot, ECG electrodes having been previously attached for registration of ambulatory heartrate on small
battery-operated ECG tape recorders.
The results of urine and heartrate analyses are shown in Tables 4 and 5. Corresponding episodes
of the two tests did not cause significantly different excretion levels in urinary metabolites.
Heartrates were significantly higher for treadmill than for Pong tasks. Rest-to-work difference in E
excretion was statistically significant for the Pong task, whereas the difference was not significant for the
treadmill task. Rest-to-work differences in excretion of NE and 17-KGS were not statistically significant
for either task.

IV. Discussion. Data collected over 10 years from several ATC facilities point to catecholamines,
principally E, as being a good indicator of the response to an applied workload. The adrenal cortical
response (17-KGS) primarily indicates chronic stress arising from unresolved conflicts such as labor
management disputes, marital difficulties, financial problems, etc. NE and E usually go in the same
direction; seldom does one go up and the other down. It is always difficult to separate the so-called
physical and mental stressors in a field study setting. Our laboratory studies indicate strongly that
mental activity without significant physical effort engenders a significant output of E above the resting
state that does not occur during episodes of purely physical effort.

It thus appears that there is a degree of stressor-response specificity. The large unanswered
question relates to the significance of the magnitude of the response. When is a person underloaded,
optimally loaded, and overloaded? It is also clear that ATCSs at ATCTs with low traffic density have a
low E output while ATCSs at high density ATCTs have a relatively greater E output. Obviously, each group
of ATCSs is doing at least an adequate job and one cannot say on that basis whether overload, optimal
load, or underload is present. However, it is distinctly possible, in view of the known effects of
catecholamines on the cardiovascular system, that the cost of adequate performance is greater for the
ATCSs at high density facilities than it is for ATCSs at low density ATCTs. Rose (5) has recently shown,
as have others in the past (6), that hypertension is more prevalent among ATCSs than among the general
population. We have shown that E excretion level is significantly and directly related to heartrate (7).
Further, we have shown in a limited group of ATCSs that elevated NE excretion is predictive of later
hypertension (8). These data suggest that the cost of adequate performance at a high traffic density
ATC facility may result in breakdown of physiological systems.

At the other end of the workload spectrum it is obvious that adequate performance can be maintained
without great arousal brought on by high blood levels of catecholamines. In short, it appears probable
that arousal necessary to meet workload demand is mediated by sympathoadrenal output of catecholamines.
This idea was first put forward by Cannon in 1929 in his description of the "fight or flight" reaction
(9) and is applicable to the ATC task.

These data also are consistent with the data reported by Schaad, Gilgen, and Grandjean, who showed
a statistically significant relationship between urinary catecholamine excretion and level of difficulty
of work in European air traffic controllers (10).
TABLE 1.  Correlation Coefficients of Stress Indexes and RTT at OPF

INDEX     RTT

Cs        0.64*
cst       0.19
ce        0.77**
cne       0.55*

*  P < 0.05
** P < 0.01

TABLE 2.  Stress Indices From Individual Controllers
          Before and After ARTS-III Installation

               Cs             cst             ce             cne
Subject      1     2        1     2        1     2        1     2

LAX
   1       0.90  1.25     0.96  0.34     0.48  1.80     1.25  1.61
   2       0.42  0.37     0.64  0.20     0.20  0.50     0.42  0.41
   3       0.30  0.36     0.48  0.32     0.09  0.18     0.32  0.57
   4       0.48  1.28     0.33  0.21     0.38  1.11     0.72  2.51
   5       0.62  0.47     0.55  0.14     0.43  0.67     0.39  0.60
   6       0.52  0.47     0.68  0.34     0.20  0.42     0.69  0.65
   7       0.42  1.42     0.59  0.33     0.79  0.59     0.38  3.42

OAK
   1       0.45  0.68     0.58  0.51     0.53  0.99     0.24  0.54
   2       0.73  0.72     0.50  0.27     1.21  1.17     0.50  0.72
   3       0.57  0.47     0.56  0.21     0.49  0.59     0.67  0.60
   4       0.35  0.71     0.31  0.29     0.29  1.56     0.44  0.28
   5       0.46  0.69     0.90  0.37     0.23  1.06     0.25  0.63
   6       0.42  0.86     0.59  0.43     0.24  1.27     0.43  0.88
   7       0.27  1.36     0.20  0.21     0.28  2.83     0.34  1.03
   8       0.18  0.57     1.17  0.05     0.15  1.01     0.22  0.65
   9       0.40  0.84     0.73  0.15     0.28  1.81     0.20  0.56
  10       0.33  0.47     0.61  0.16     0.22  1.03     0.17  0.22
  11       0.22  0.74     0.31  0.10     0.16  1.33     0.20  0.79

TABLE 3.  Number of Aircraft Worked and Number of Radio Contacts in
          a 5-Day Workweek.

FACILITY       NO. AIRCRAFT    NO. CONTACTS    CONTACTS/AIRCRAFT

LAX (1972)        1,803           13,806             7.66
LAX (1974)        1,860           14,210             7.64
OAK (1972)        1,190            8,712             7.32
OAK (1974)        1,233            8,827             7.13

TABLE 4.  Comparison of Excretion Values and Heartrates for Pong and Treadmill Tasks*

                   Total Amounts of Hormones Excreted
Task               17-KGS        E          NE         Heartrate
                    (mg)        (ng)       (ng)     (Beats Per Minute)

Rest (Pong)         0.70       1,237      3,603            64
Rest (T-Mill)       0.67       1,214      4,274            64
P                   NS**        NS         NS              NS

Pong 1              0.70       1,619      3,809            73
T-Mill 1            0.59       1,741      4,384           101
P                   NS          NS         NS            0.05

Pong 2              0.62       1,720      3,379            73
T-Mill 2            0.67       1,463      3,813           100
P                   NS          NS         NS            0.05

Pong 3              0.59       1,750      3,833            70
T-Mill 3            0.58       1,491      3,581            98
P                   NS          NS         NS            0.01

*  Group averages
** t-test

TABLE 5.  Statistical Significance of Rest-to-Work Differences for the Various Measurements*

          Level of Significance of Difference Between Rest and Task (P**)

TASK          17-KGS       E         NE       HEARTRATE

Pong 1          NS        0.01       NS          NS
Pong 2          NS        0.01       NS          NS
Pong 3          NS        0.05       NS          NS
T-Mill 1        NS         NS        NS         0.01
T-Mill 2        NS         NS        NS         0.01
T-Mill 3        NS         NS        NS         0.01

*  See Table 4 for actual values
** Paired t-test.
[Figure: Streng triangle representation of changes occurring in cst, ce, and cne after installation
of ARTS-III at Oakland and Los Angeles TRACONs. Panels and composite index values: OAK ('72),
Cs = 0.60; LAX ('72), Cs = 0.60; OAK ('74), Cs = 0.72; LAX ('74), Cs = 0.75.]

FIGURE 5.  Diagrammatic representation of the relationship between cst, ce,
and cne on Streng's triangle. Comparison of LAX and OAK TRACONs
before and after installation of ARTS-III.


FIGURE  6.  Graph  of  annual  traffic  count  (in  millions  of  operations)  vs.  mean 
urinary  excretion  levels  of  E  of  controllers  at  the  various 
facilities.  Crosses  represent  on-duty  excretion  levels  of  E; 
circles  represent  corresponding  resting  levels  (ORD  graphed  at 
actual  traffic  count  and  adjusted  value  (+)  as  explained  in  the  text). 
INTRODUCTION

A few years ago Dean Chiles (1), while discussing objective methods for developing indices of pilot
workload, talked about a hypothetical research vehicle having a system with the following capabilities:
(1) it would have an exact assignment of the nature and number of pilot duties that could be developed
for any given mission; (2) it would be possible to vary those duties in any combination over time; (3)
the control and display characteristics of the vehicle could be manipulated at will; (4) precise and
reliable quantitative indices of the task demands placed on the pilot by the system would be available
for all task elements; (5) precise and reliable quantitative measures of the skill with which the pilot
meets these demands would be available; and (6) an adequate criterion measure of total system performance
would be available. If we had such a hypothetical vehicle, it is obvious that we would be able to
determine the priorities that a pilot assigns certain tasks. We would also be able to assess the attention
demands on the pilot by the system, and we would be able to determine which tasks or performance functions
are most sensitive to variations in total demand. We might be able to solve some of the problems we have
experienced in terms of the human being acting at certain optimal work rates, ignoring certain signals
from his display, working at his own pace, failing to pay attention to certain instruments or signals, but
in general, always managing to perform at a satisfactory enough level to complete the mission. Perhaps
a hypothetical vehicle, being a realistic flying system, would also help to weed out the variable caused
by non-flying laboratory systems where the subject is allowed to crash the system because of decremented
performance, whereas in "real world" systems we find that the pilot works harder and harder, performing
more and more control responses and movements, but the end result is usually to make a landing that he
can walk away from.

Chiles also pointed out that the first and foremost factor to keep in mind in choosing a methodology
for assessment is the purpose or goal of the research. Unfortunately, the entire history of assessing
workload, performance, or stress in the human operator is one of compromise. We have to compromise because
of safety and because of operational requirements; we have to devise laboratory-type tasks because the real
thing is not available or is unassailable to our measures; and oftentimes we must rely upon human beings
other than pilots to perform these tasks because of the demands upon pilotage time in the real systems
world.

The assessment correlates of workload, performance, and stress can be divided into several areas:
those of physiological correlates, psychological correlates, stress correlates, psychophysiologic
correlates, and finally central nervous system (CNS) correlates. We realize that this is an artificial
taxonomy and that many areas of overlap exist; however, we thought we would arbitrarily subsume under the
heading psychologic correlates those tasks of vigilance, monitoring, tracking, reaction time, and so
forth. It should be noted that we are talking about the operational aspects of psychologic correlates,
and that many of the psychological tests have been used as selection devices or as measures of skill in order
to predict successful training as a pilot or aircrew member. These tests, for the most part, have shown
little relationship to the prediction of workload and performance abilities. Many of these tasks have had
the disadvantage of single operators looking at single displays and measuring single scores which are often
poorly related to the real world of the man-machine interface. Some of these measures are difficult, if
not impossible, to obtain in the operational environment. However, we should keep in mind that it may
not always be necessary to measure everything in the operational environment, that is, while the aircrew
member is piloting or doing his thing. It may be possible to ascertain whether he is capable of initiating
a particular flight mission, and it may be possible to ascertain upon his return from a mission the
amount of decrement that resulted, by using relatively simple field-type tasks. Very often in pursuit
of psychologic variables we have resorted to the use of subjective evaluation of such factors as fatigue,
stress, irritation, etc. Unfortunately, we are finding increasing evidence that certain aspects of these
factors may not be amenable to accurate subjective evaluation.

Some 20 years ago this author, in his thesis study, which had to do with the effect of binaural beats
upon performance (2), found that there was a subjectively experienced quality of beats produced externally
in an audio mixer as compared with those auditory beats generated centrally in the central nervous system.
These two kinds of auditory beat phenomena were perceived as differentially disruptive by the subjects.
In one experiment the subjects said they felt that the stimulus in neither session bothered them in any
way, especially in terms of their performance on the tests that they were required to do. They ignored
the fact that they were not doing any better in consecutive performances and, in fact, that they were
actually doing somewhat poorer. In another experiment, the subjects invariably reported that the sound
of an externally produced beat really bothered them and that they were sure that their performance was
affected. This was in spite of the fact that they could observe that they were completing more items
and doing better than they had on previous sessions. We were forced to conclude that the neural mechanism
by which binaural beats influenced performance is not open to correct subjective evaluation.

In terms of physiologic correlates of workload and performance we are considering the electromyogram,
the electrocardiogram, the measurement of various metabolites in the parotid fluid and urinary tract, etc.
Unfortunately, these physiologic correlates, while telling us that a human being has been stressed, do
not tell when in the course of time the stress occurred, or what was the nature of the stress. They simply
tell us the end result of workloads and performance which alter the body's physiology in such a manner that
the effects can be measured. Nevertheless, some of these measures do yield interesting correlates of
performance.
Stress correlates of workload and performance are somewhat more ambiguous since they bridge both
the psychologic and physiologic factors. Perhaps it would be better to consider terms such as
environmental and operational stresses. We have to consider the problem of the acuteness of the stress versus
chronic stress versus the cumulative effects of stress. Perhaps we will be able to explore the usefulness
of the concept of task-induced stress, where the stress lies clearly within the task in such a way as to
present the human operator with a unique situation.

Some of the psychophysiologic correlates that we will consider are the critical-flicker-fusion rate,
the psychogalvanic skin response, and electro-oculography. These and other measures can be useful in terms
of revealing parameters of central nervous system function. We will elaborate considerably in this area
because it leads us to the important concept of central nervous system correlates of workload and
performance. Nevertheless, psychophysiologic correlates are difficult to relate to actual performance or to
workload effects because they may reflect subjective evaluation of task difficulties and they may be
related to subjectively calculated probability measures. Nevertheless, if one can tease out these
factors, one is left with some correlates that may reflect the activity of the central nervous system.

In spite of the various problems of correlates of assessment, we will try to explore some of the
various correlates, both old and new, which may offer some help in the quest for measures and assessment
of human workload and performance. No attempt is going to be made to make this a global overview. Rather,
we have been highly selective in eliminating many measures which appear to offer no fruitful results for
the amount of effort expended. We have carefully eliminated any measures which can be regarded as
selection tools or measurements of ability, skills, and so forth, which have little to do with the ultimate
question of trying to evaluate or predict human performance in the operational environment.

Psychologic Correlates: The tasks of vigilance, monitoring, and tracking seem to have a common root. Most
of them involve relatively long times at the task; they involve the detection of a signal or the detection
of a nonsignal or nonoccurrence and some form of motor response. Usually, the tasks are of a simple
nature, although they may be made more and more complex in terms of additional targets, etc. Such tasks
also may be used to evaluate the effects of other kinds of stimuli on vigilance, monitoring, or tracking
type performance.

In general, we have found that tasks of this type, when considered individually, that is, one task
performed by one operator in one session, differ considerably from the same tasks embedded in a multiple
task-type simulator or a multiple task paradigm. The vigilance, monitoring, or tracking task is felt to measure
alertness and provides for minimal requirements for intellectual and neuromuscular function. A typical
such task is described in the Neptune system (3), wherein the display consists of three meters, each with a
zero-centered needle which deflects either left or right, and six push buttons, two for each meter. The subject
monitors the meters until a needle deflects; then he pushes the correct button and the needle returns
to the center position. The measure of this performance is response time. Programming of the signals
is aperiodic. This is a simple, very undemanding task element, but the behavior it measures is considered
important at low levels of arousal. Trumbo (4) describes an experiment in which subjects tracked
a step-function sequence with six possible target positions. From each position there were two alternative
steps, unequally probable and either in the same or opposite directions. The subjects had to anticipate
to minimize error; therefore, each step presented them with either an amplitude or a direction prediction
problem. The outcome scores which they obtained showed a relationship between input uncertainty and
tracking error. Evidence for response strategies or organization came only from continuous records. These
records revealed that subjects clearly used different strategies in the two prediction situations, matching
event probabilities in predicting direction, but averaging probabilities in predicting amplitude. The
importance of this finding is that not only were these strategies unavailable in the outcome scores, which
are usually measured in terms of response time, but the averaging strategy could not have been identified
if subjects had been limited to discrete response alternatives rather than a continuously graded response.

In the measurement of vigilance, monitoring, and tracking activities, usually we have found a number
of hybrid tasks put together to evaluate a particular problem. However, in general we find that a
nonmechanical, electronically driven system such as an oscilloscope provides a display with the most
advantages. It is accessible to automated scoring and is readily adaptable to pursuit or compensatory
displays and to one- or two-dimensional courses. Ideally, the response or control apparatus should permit
the operator to produce a continuum of graded responses, especially if one is interested in evaluating
the aspect of motor skills along with the vigilance or monitoring activity. There are many off-the-shelf
function generators to provide a programming system, but the use of an analog computer system would
certainly be advisable if one is seriously interested in the determination of probabilities and the ability
to vary stimulus organization along many dimensions. Most investigators are not concerned with motor
skills for the sake of investigating motor skills themselves, but have used a motor-type response as a
convenient vehicle for testing other hypotheses, usually involving procedural variables which have been
derived from general behavioral learning theories.

Trumbo points out that the rotary pursuit apparatus is a good case in point. He states that it is
certainly the best known and most widely used motor skills apparatus available. It is, of course, a motor
tracking task and it has been used in a host of studies on distribution of practice and other procedural
variables. Yet, in terms of task variables it is limited to little more than rate of turn, target size,
and stylus weight. The rotary pursuit generally yields a single time-on-target score. The time-on-target
score has definite limitations in that it does not use all of the data in the error distribution. In
order to make use of the data in the error distribution, it is necessary to measure root mean square error
or average error or integrated error. These values are readily obtained with an electronic scoring system.
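
The difference between these scores is easy to see on a sampled error trace. The Python sketch below is an illustration only: the function name, the sampling interval, the tolerance band, and the two short traces are all hypothetical. It computes time-on-target alongside root mean square, average absolute, and integrated absolute error.

```python
import math

def tracking_scores(errors, dt, tolerance):
    """Summarize a sampled tracking-error trace (target minus cursor position).

    errors: error samples; dt: seconds per sample; tolerance: half-width of
    the on-target band. All three are hypothetical inputs for illustration.
    """
    n = len(errors)
    on_target = sum(1 for e in errors if abs(e) <= tolerance)
    return {
        "time_on_target": on_target * dt,
        "rms_error": math.sqrt(sum(e * e for e in errors) / n),
        "mean_abs_error": sum(abs(e) for e in errors) / n,
        "integrated_abs_error": sum(abs(e) for e in errors) * dt,
    }

# Two invented traces: each spends half its samples inside the band,
# but trace_b strays much farther when it is off target.
trace_a = [0.5, -0.5, 2.0, -2.0]
trace_b = [0.1, -0.1, 6.0, -6.0]
score_a = tracking_scores(trace_a, dt=0.1, tolerance=1.0)
score_b = tracking_scores(trace_b, dt=0.1, tolerance=1.0)
print(score_a)
print(score_b)
```

Both traces earn the same time-on-target, yet their RMS and average errors differ severalfold, which is the information the single time-on-target score discards.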

Bahrick, Fitts, and Briggs (5) point out in a 1957 paper that time-on-target scores, which are the
amount of time during a particular trial that a person is able to remain within an arbitrarily specified
region around a target using some kind of tracking device, present some real problems in terms of the
derivation of learning curves and their meaningfulness relative to the basic process of vigilance, tracking,
and monitoring. They advocate the use of the root mean square (RMS) measurement as the best method of
avoiding difficulties because it simply substitutes a single function for an unlimited number of functions
determined by all possible target or activity dimensions. In general, response characteristics may
follow a continuous and normal distribution, but learning or practice results in a diminished variance
of this distribution; however, performance is scored according to an all-or-none criterion of frequency
of occurrence. This scoring practice accounts for the lack of predictability of such tests as the
steadiness test, the dotting test, tweezer dexterity tests, pegboard tests, etc., whenever success is
scored against an all-or-none criterion.
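
A small numerical sketch may make the scoring problem concrete. It assumes, for illustration only, that tracking error is normally distributed about zero, so that its standard deviation equals the RMS error and the expected all-or-none score is the probability of falling inside the tolerance band. The function name and the numbers are hypothetical.

```python
import math

def fraction_on_target(sigma, tolerance):
    """P(|error| <= tolerance) for zero-mean normal error with std dev sigma."""
    return math.erf(tolerance / (sigma * math.sqrt(2)))

# Hypothetical learning effect: practice halves the error spread from 4 to 2 units.
results = {}
for tolerance in (1.0, 4.0):
    before = fraction_on_target(4.0, tolerance)
    after = fraction_on_target(2.0, tolerance)
    results[tolerance] = (before, after)
    print(f"tolerance {tolerance}: {before:.3f} -> {after:.3f} (gain {after - before:.3f})")
```

The same halving of RMS error produces a different apparent gain depending on where the band is drawn, so each choice of target size yields its own learning curve, whereas the RMS measure gives a single one.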

Kennedy (6) reports on a vigilance task which increased in difficulty from one to three channels.
When he stressed a group performing both one- and three-channel vigilance monitoring, the stress group
performed better than the nonstress group, indicating a certain level of arousal or an alerting response to
the threat of shock.

In another study, DeMaio et al. (7) reported comparisons between student and instructor pilots
using a visual scanning task, showing that instructor pilots learned to attend to critical features more
efficiently than do individuals with little or no flight experience. This suggests the interesting
possibility of using a variety of scanning tasks in the undergraduate pilot training program to
facilitate the more rapid development of adaptive scanning strategies. Thus, instead of using a scanning or
monitoring type task to evaluate stress or other such factors, we have the use of the task as a training
device.

Reaction  rime  has  also  suffered  In  the  total  context  of  suspicion  about  the  ambiguity  of  simple 
performance  treasures.  This  again  relates  to  the  concept  of  performance  versus  effort.  This  can  be  seen 
in  the  form  of  simple  performance  in  two  working  situations  which  are  obviously  different  because  the 
operator  sees  it  as  his  task  to  achieve  a  particular  output  rate  and  he  adjusts  his  effort  accordingly. 

Now this effort can be detected by various refinements of performance data, but it remains true that
straightforward speed-error data will not easily reveal differences of effort. Using speed as a measure
of complexity makes unwarranted assumptions about the essentially sequential rather than parallel nature
of human performance processing. Nevertheless, reaction time measurement has been an attractive area
for scientific research for some time. Apparently scientists have been intrigued with the attempt to
quantify the absolute speed with which a human can react to a signal. Perhaps the landmark paper in this
field was one by F. C. Donders in 1865. O'Donnell (8) reports that Donders developed a "subtraction
method" of reaction time. Basically, this theory stated that the elements of the reaction time response
were additive, in that each component of the response begins immediately when, and only when, the
preceding component has ended. Thus, decision "is a component of a reaction involving choice. When the
decision is being made, nothing else is going on and when that process ends, the subject immediately
moves into another, perhaps movement, phase. If this and other assumptions are true, then decision time
could be calculated by knowing the total reaction time and subtracting from it the time required for all
other components of the response." Using this approach, Donders distinguished between three types of
reactions. The "A" reaction involved a single response to a single invariant stimulus. This is what
is now called simple reaction time. The "B" reaction involved two stimuli and two responses with the
stimulus-response relationship always constant. This is the most simple form of choice reaction time
presented. The "C" reaction also used two stimuli, but in this case only one response was used and that
response was to be given to one of the stimuli, but not to the other. Consequently, one could calculate
response selection time by calculating "B" minus "C" and stimulus categorization time by "C" minus "A",
etc. In spite of over 100 years of use of reaction time testing with various controversies, we still have
many measures that are only indirectly related to reaction time per se, but they use reaction time as a
measure of performance. These secondary techniques, if we may call them that, in general differ from
the classic view of reaction time where the stimulus is a relatively discrete signal introduced by the
experimenter for the subject's discrete response. These secondary techniques involve self-pacing by the
subjects, simultaneous performance on a number of tasks and/or use of a complete series of reactions to
obtain a total score for task completion rather than a reaction time score per se. For example, Hartman
and I have used a reaction time component for mental arithmetic and tracking, etc. Recently, Wood (9) has
undertaken a study of the neurophysiological basis of reaction time change as a viable means of exploring
physiological mechanisms of local muscular fatigue and fatigue effects on sensori-motor performance. Here,
he used measures of reaction time, evoked potential, and EMG as indicators of central and peripheral
activity. With this particular reaction time model he is able to fractionate total reaction time into
component latencies and to study central versus peripheral issues which feature prominently
in both fatigue and reaction time. Wilkinson (10) reports on a small battery-powered fully portable
device for administering a four-choice serial reaction time test and recording the results on a standard
magnetic tape cassette. In preliminary performance trials, this test appears to reflect fatigue due to
continuous repetitive responding in a way similar to classical nonportable multiple-choice serial reaction
tests. Gaillard et al. (11) report some effects of ACTH 4-10 using a serial reaction task, concluding
that this particular drug counteracts the usual decay in performance as a function of time on task due
to increasing boredom and mental fatigue. Bartz (12) describes an experiment using peripheral detection
and central task complexity where reaction time is measured relative to the peripheral stimulus. He
supports Hebb's arousal theory which would predict that increasing the complexity of the central task
would heighten the subjects' vigilance performance. Salzman and Jaques (13) explored the relationship
between heart rate changes and reaction time. They found no relationships between reaction time and the
heart beat immediately preceding the stimulus or with the beat during which the stimulus was presented.
Therefore, response latencies in terms of reaction time did not differ significantly as a function of
phase in the cardiac cycle as predicted by J. Lacey, who suggested that feedback from cardiac events can
affect central functioning by a negative feedback regulatory loop mechanism. Thackray, Bailey, and
Touchstone (14), reporting on boredom and monotony while performing a simulated radar control task, showed
that a high boredom/monotony group revealed greater increases in response times, heart rate variability,
and "strain", and a greater decrease in attentiveness. They conclude that the pattern associated with
boredom and monotony seems more closely related to attentional processes than to arousal. Holt and
Brainard (15) reported an experiment using reaction time and a condition of selective hyperthermia where
they raised cortical temperatures. In this task (a simple choice reaction time task), response times and
response variabilities were decreased compared to performance in either control or placebo condition.
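
The arithmetic of the subtraction method attributed to Donders above can be illustrated with a short sketch. It assumes the strictly additive, sequential stage model described in the text; the function name, the dictionary keys, and the reaction times (in milliseconds) are hypothetical and are not data from any study cited here.

```python
from statistics import mean


def donders_components(a_times, b_times, c_times):
    """Estimate stage durations from A, B and C reaction times (ms).

    Assumes the additive stage model: each stage starts only when the
    previous one has ended, so stage durations can be obtained by
    subtracting mean reaction times.
    """
    a = mean(a_times)  # "A": one stimulus, one response (simple RT)
    b = mean(b_times)  # "B": two stimuli, two responses (choice RT)
    c = mean(c_times)  # "C": two stimuli, respond to only one of them
    return {
        "simple_rt": a,
        "stimulus_categorization": c - a,  # "C" minus "A"
        "response_selection": b - c,       # "B" minus "C"
    }


# Hypothetical reaction times in milliseconds, for illustration only.
a_trials = [190, 200, 210]
b_trials = [290, 300, 310]
c_trials = [240, 250, 260]

result = donders_components(a_trials, b_trials, c_trials)
print(result)
```

If the stages overlap in time, which is the objection to purely sequential models raised earlier in this section, these differences no longer isolate a single stage.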


heart rate, etc., as physiological and others such as we have just discussed, vigilance, tracking,
monitoring, as psychological. However, as Singleton (16) has pointed out, it now begins to look as
though the complexities and interactions within the human body are such that neither discipline is
adequate alone for the study of any problem of man at work. The physiologist can be accused of too
narrow an approach with insufficient regard to cortical dominance and tending to deal with endocrine and
autonomic parameters. Similarly, the psychologist can be accused of treating the human operator as too
"pure" an information processing device without sufficient regard for subcortical and somatic factors
which clearly influence performance. Directing attention now to the physiological correlates we find
that physiological measures of heart rate, muscle physiology, body metabolites such as 17-ketosteroids,
etc., have been used to provide a method of measurement and to provide a set of standards. In terms of
a set of standards derived from these kinds of measures it must be pointed out that it has been difficult
to determine that a particular task requires an energy expenditure of so many calories per minute or
hour, but it is even more difficult to determine whether this energy expenditure, or heart rate level, or
outpouring of metabolite constitutes a light workload, a light energy stress, or a heavy, or even an
intolerable amount of effort on the part of the human. Another aspect of the general difficulty about
physiological measures is that the stress on the operator tends to have similar effects whether it is due
to work, development, fear or environmental factors such as noise, vibration, etc. Nevertheless, there
is some value in using physiological concepts to attempt to predict the behavior of the human operator,
keeping in mind that the relationship between simple physical measures of the environment and the
corresponding effects upon the operator invariably turns out to be a multidimensional problem with
dominant influences from many variables difficult to measure or control, especially those in the
psychologic realm of attitude and motivation. Sharkey, McDonald and Corbridge (17) point out in a paper in
which they evaluate pulse rate and pulmonary ventilation as predictors of human energy cost that this
human energy cost and efficiency are of considerable importance in the evaluation of the equipment for
industrial tasks. Now this is particularly true if we look at industrial tasks and related equipment in
the light of aircrew protection, garments for altitude effects, thermal effects, and chemical defense.
Pulmonary ventilation rate as a predictor of human energy costs has long been known and used; however, the
accurate assessment of ventilation rate depends on cumbersome gas analysis techniques and still requires
that gas be collected in the field and transported to the laboratory, where the time spent in analysis
still restricts sample sizes to those relatively small. Therefore, this particular investigation attempts
to compare the precision of prediction of human energy costs afforded by both pulse rate and ventilation
rates. In short, in spite of the attractiveness of using relatively simple determination and recording
of pulse rate, the use of pulse rate alone in lieu of ventilation rate would indicate the possibility of
larger errors in predicting energy cost. In spite of the drawbacks of pulse rate alone, it should be
pointed out that Sharkey et al. indicate that predicted energy costs were over-estimated rather than
under-estimated (17).

In another study relating to task and load difficulties using the EKG, Schwarz and Ekkers (18)
discuss the task of developing the optimal functioning reliability of a complex system from three
aspects: (1) the development of analysis of the reliability of the system; (2) the organizational rules
of procedure by which unanticipated emergencies can be forestalled; and (3) equipping the individual
operator physically and mentally to regulate tasks and load difficulties. In this study, they found that
EKG was significantly related to the perceived gravity of an unannounced or emergency situation. In
another study relating task demand reflected in physiological variables, Frankenhaeuser and Johansson (19)
measured catecholamine excretion and heart rate variance, pointing out that the physiological arousal
indices were more susceptible than performance measures to the level of task demands. In other words,
the higher demand imposed by a double conflict task was reflected in relatively larger increases of
adrenalin excretion and heart rate whereas performance measures which were psychological remained
unaffected. A study by George Montgomery (20) on the effects of performance evaluation and anxiety on
cardiac response in anticipation of a difficult problem solving task showed that analysis of
second-by-second changes in cardiac rate revealed that waveform components were sensitive to both anxiety
and failure within the evaluation stress condition only. Initial cardiac acceleration responses covaried
with performance measures across anxiety groups, apparently reflecting differences in confidence or
motivation. Concept of the anticipated problem solving task was reflected in a cardiac foreperiod
deceleration response which is very likely related to attentional readiness for the beginning of the problem.

In a relatively long-term study of the activity of the nervous system during pilotage activities of
letdown, approach, and landing, Nicholson and his colleagues (21) have related the pilot's subjective
assessment of his workload to changes in heart rate using the RR interval and to finger tremor measured
by an accelerometer. They have concluded that the mean heart rate interval around touchdown reflects the
workload of the crew's letdown, approach, and landing phases whereas changes in finger tremor are associated
with untoward events during the approach which relate to difficulties in the dynamic flight situation
involving weather, wind shear and other factors. A follow-on study of four years of workload assessment
was done to determine how effective their measures were in terms of reliability. They report that the
subjective assessments of the pilot are meaningful. However, they note that the degree of neurological
changes associated with the possibility of impaired subjective analysis of workload may be related to
the fact that under difficult circumstances a pilot may have a degree of central nervous system arousal
above that which may be associated with optimum performance. Their finger tremor technique is interesting
because it may be related to the release of catecholamines which are in turn associated with finger tremor.
On the other hand, ballistocardiographic effects may play a role in this mechanism, since finger tremor
has been observed with muscular contraction during pronounced tachycardia. As they conclude, "the
peripheral changes in nervous activity observed during the letdown, approach, and landing may indicate
two physiological states both of which arise from central nervous arousal. In the case of high workload
letdowns without untoward events, profound cardiac acceleration and limited finger tremor are the
physiological changes of neurogenic origin. In letdowns in which the approach is complicated, profound
finger tremor dominates the picture and may be associated with circulating catecholamines."

In another study relating physiological correlates to changes during a mental task, Kahneman, Tursky,
Shapiro and Crider (22) had subjects perform a paced mental task at three levels of difficulty while they
recorded pupil diameter, heart rate, and skin resistance changes. They reported a similar pattern of
sympathetic-like increase found in the three autonomic functions during performance intake and processing
followed by decrease during the report phase. The peak response of each of these three measures was
ordered as a function of task difficulty. There is considerable evidence that problem solving performance
as well as other tasks are associated with activation of the sympathetic nervous system indicated by
increased electrodermal activity, increased heart rate, increased blood pressure and peripheral
vasoconstriction.

It has long been known that pupil dilation occurs during mental activity. More recent research by
Kahneman and Beatty (23) suggests that this indicator may be particularly sensitive to mental activity
in a special way. While it is true that pupillary changes are associated with activation of the
sympathetic nervous system, an even more important index of arousal is the fact that the oculomotor nerves
which act to change pupillary size originate in the ascending reticular formation and provide us with an
important window into that system. It is unfortunate indeed that pupillary measures are so difficult to
obtain in terms of equipment and the physical constraints imposed. Nevertheless, important work is
underway that will hopefully relate pupillary changes to other more easily obtainable physiological and
psychophysiological correlates of workload, performance, and stress. Before turning to these
psychophysiological correlates, we will direct immediate attention and comment to some of the remaining
physiological correlates, namely body metabolites and the electromyograph.

By measuring simultaneously the urinary excretion of most of the known hormones, it has been
established that the organism's response to stress involves a total neuroendocrine apparatus. As
Dukes-Dobos (24) has stated, these hormones can be divided into groups according to their excretory pattern.
One group of hormones is excreted in increased amounts during the stress exposure and the other group
shows a biphasic change inasmuch as these hormones are excreted in decreased amounts during the stress
and in increased amounts during the recovery phase. Studies performed on the urinary mucoproteins suggest
that the excretion rate of this substance is an indicator of the speed of catabolic processes in the body,
reflecting the balance of the total neuroendocrine response to stress. These measures, while important,
present us with certain problems in the interpretation of such changes. For instance, we only know that
the individual has been stressed; we do not know the exact time at which the stress occurred. Also, a
reduction in excretion after repeated exposures to a stress may be due to either adaptation or fatigue.

Since the active state of the human operator is connected with the sympathetic tonus, one could assume
that the hormones of the sympatho-adrenal-medullary system (adrenalin and noradrenalin) must always be
excreted in increased amounts during physical exercise or mental work. However, as Dukes-Dobos points out,
while some investigators have found increases in one or the other catecholamines in the urine after physical
exercise or psychological stress, others did not find such changes at all. One reason for the confusing
results may be that the blood-brain barrier permits only a small amount of noradrenalin to cross through
from the brain to the blood and then show up in the urine. Therefore, the urinary noradrenalin level
depends upon the activity of the peripheral sympathetic nerve endings which may or may not be related to
noradrenalin release in the brain. Thus, urinary noradrenalin does not give a reliable estimate of the
total noradrenalin excreted in the sympathetic nervous system. On the other hand, urinary excretion of
adrenalin may reflect completely the activity of the adrenal medulla. According to the classic experiments
of Von Euler (25), excretion of catecholamines will increase after physical exercise only if the subject
discovers that the performance requires a special effort.

In one of the many studies on airplane pilots performed by Hale (26), urine was sampled every four hours
over a 28-hour period from the crew members during the first transatlantic helicopter flight. The flight
was a risky undertaking and bad weather conditions often threatened its success. The average adrenalin
and noradrenalin excretions of the crew members were elevated. What can be considered a unique finding
was that an increased adrenalin excretion during the flight was observed in all ten of the subjects.
Other urinary metabolites measured were excreted in increased amounts by some subjects and in decreased
amounts by others compared to the controls. Thus, adrenalin excretion seems to be the best parameter
for assessing the magnitude of stress brought about by a task which is not demanding as far as physical
exertion is concerned, but is connected with stressful work conditions and is in fact hazardous.

Reflecting Selye's (27) stress concept, physical as well as mental work can be considered as a stress
factor which may evoke the general adaptation syndrome, thus activating the pituitary-adrenal-cortical
axis. Many studies have demonstrated that the excreted metabolites of this endocrine system show
quantitative changes after physical exercise as well as psychological stress. This mechanism has been
explored by studies utilizing measurements of 17-hydroxycorticosteroids (17-OHCS). In general, we have
found that during periods of stress, the 17-OHCS excretion increases; however, we have also found that an
exposure to day-to-day stress may bring about a state of "chronic-adaptation" fatigue which will cause a
drop in 17-OHCS excretion instead of an elevation. A relatively unique tool for the evaluation of 17-OHCS
levels was pioneered and developed by Shannon (28) using parotid fluid collection and analyzing these
samples for free 17-OHCS levels. The development of this technique is very interesting to follow and
represents a determined effort to avoid some of the dangers, discomforts, and logistics problems involved
in collecting in-flight specimens of blood and urine. The refined technique for collecting parotid fluid
involves a plastic collecting device using an acrylic bite-block molded to the individual bite of each
subject. This technique allows easy and rapid self-positioning of the device over the parotid duct opening.
It should be noted that in a study by Warren, Ware, Shannon and Leverett (29) they state that the in-flight
parotid fluid collection technique has been developed to the point where it represents a valuable adjunct
for in-flight physiological studies. This is especially true because rises in steroid levels in parotid
fluid do not demonstrate the lag that is characteristic of urinary steroid responses.

The electrical measurement of muscle activity, the electromyograph, has a long history. Most of the
derived muscle physiology relates to laboratory studies in which the particular muscle or muscle group has
been stressed to bear maximum muscle contractions during a relatively short period of time. Not many muscle
studies have been related to operational purposes where muscles are stretched, fatigued or otherwise tasked
over long periods of time and unique workload situations. If we set aside the work of those various
researchers and clinicians interested in musculoskeletal relaxation techniques, perhaps the most important
investigations in the neurophysiology of muscle have been done by Basmajian (30). We will not review all
control that can be obtained by humans to indicate that this important physiological correlate should
not remain ignored in the assessment of workload, performance, and stress.

Basmajian has developed a technique and the necessary instrumentation in the form of bipolar
intramuscular electrodes to study the changing patterns of activity of individual motoneurons through the
application of modified electromyographic methods. In a series of studies he has demonstrated that human
beings can learn to activate or repress any number of spinal motoneurons in a given pool. The human
subject can also learn to voluntarily select individual motoneurons and to control the firing of these
neurons through the assistance of auditory and visual feedback. Some of his subjects learned such exquisite
control over these individual motoneurons that they were able to produce various rhythms and patterns by
deliberately speeding and slowing the firing of the individual neuron. Basmajian's technique is a
promising method for the study of many fundamental phenomena in the nervous system relating to cortical
and subcortical effects upon the motoneuron and to conditioning and learning. One other important
area of investigation would be that of pharmacological agents on various parts of the motor pathways and
muscle activity itself. Of a more applied nature in the area of electromyographic investigation, Lafevers
(31) reports on a work task performed in a full pressure suit. Lafevers performed a power spectral density
analysis of EMG recordings from several muscle groups involved in a push-pull task at various reach
positions in both a space suit and in shirt sleeves. He feels that the power spectral shifts indicated
significant findings relative to the performance and stress requirements for these muscle groups. The
reason that these would not be expected to appear in ordinary electromyographic determinations is the
fact that the task requirements of the study were not of a fatiguing nature nor did they stress the muscle
groups to their utmost. This type of work suggests many areas where the relationship between the
man-machine interface in terms of motor activity and response requirements might be explored. Here, it
should be possible to identify task requirements that promote muscular fatigue and the resultant effects
of this fatigue on both man and the particular task involved.
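
To make the idea of a power spectral shift concrete, the following is a minimal sketch in Python. It is not a reconstruction of the analysis reported by Lafevers (31): the use of median frequency as the summary statistic, the function names, the sampling rate, and the two synthetic signals are all assumptions chosen for illustration. A naive discrete Fourier transform is used so that the example needs only the standard library.

```python
import cmath
import math


def power_spectrum(samples, sample_rate_hz):
    """Return (frequencies, power) for the one-sided spectrum via a naive DFT."""
    n = len(samples)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(samples)
        )
        freqs.append(k * sample_rate_hz / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def median_frequency(freqs, power):
    """Frequency at which the cumulative power first reaches half the total."""
    half = sum(power) / 2.0
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]


# Hypothetical signals: two sinusoids standing in for EMG content.
RATE_HZ = 512   # assumed sampling rate
N = 512         # one second of data, giving 1 Hz frequency bins
t = [i / RATE_HZ for i in range(N)]

# Condition 1: most power in the higher-frequency component.
cond_1 = [math.sin(2 * math.pi * 100 * s) + 0.5 * math.sin(2 * math.pi * 40 * s) for s in t]
# Condition 2: most power in the lower-frequency component.
cond_2 = [0.5 * math.sin(2 * math.pi * 100 * s) + math.sin(2 * math.pi * 40 * s) for s in t]

mf_1 = median_frequency(*power_spectrum(cond_1, RATE_HZ))
mf_2 = median_frequency(*power_spectrum(cond_2, RATE_HZ))
print(mf_1, mf_2)
```

The two synthetic conditions have nearly the same overall amplitude but distribute their power differently across frequency, which is the kind of difference an amplitude-only measure would tend to miss. Real EMG is broadband and noisy, so a practical analysis would use windowed, averaged spectral estimates rather than a single raw transform.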

Stress Correlates: The stress correlates of workload and performance might be considered the environmental,
operational, and internal results of acute, chronic and cumulative effects of psychological and
physiological activities. In response to environmental stressors such as heat, noise, vibration and so
forth, and the operational requirements of a particular task, duty, or mission, the internal environment
of the human begins to respond in a more or less predictable fashion. The end result of the stress
correlates of behavior is a deterioration in activity and a series of determinable changes which we
usually call fatigue.
Fatigue, as Grandjean and Kogi (32) report in their introductory remarks to the Kyoto symposium on
methodology of fatigue assessment, is a subjective sensation in many ways where we feel not only tired in
our bodily parts and clumsy in psychomotor activity, but we feel hampered and inhibited in doing either
physical or mental work. This inhibition of activity continues until we are constrained against doing any
form of active endeavor. These sensations of fatigue can be assumed to have a protective function in that
they force us to avoid further stress and allow recovery to take place. The concept of fatigue is not a
popular scientific term because it is difficult to evaluate and to quantify. A series of studies reported
by Wolf (33) and later by Saito (34) report that the sensation of fatigue has three major components: (1)
a sensation of bodily tiredness and drowsiness; (2) a sensation of weakened motivation or concentration
towards a task; and (3) a group of physical complaints that relate very closely to what are commonly called
the psychosomatic disorders. These psychosomatic complaints are usually those of headache, palpitations,
tachycardia, shortness of breath, loss of appetite and indigestion or sleeplessness. A predominance of
these kinds of complaints is usually referred to as clinical fatigue. In the presence of clinical fatigue,
absences from work predominate due to "illness," and there arises a general negative attitude towards one's
work, one's superiors or the place of work, which obviously can just as well be a cause of clinical fatigue
as a result of it. Compounding the problem of evaluating chronic fatigue is the fact that
clinically it is well known that people with psychological conflicts and difficulties are especially prone
to this state. This makes it difficult to separate the psychogenic factors from exogenous causes of
fatigue. In spite of the fact that the actual components of fatigue are somewhat difficult to quantify
scientifically, it is not difficult to assume that the commonly experienced sensations of fatigue are very
likely a biological sign of the necessity for man to enter into a recovery phase by informing us that the
relative inflow of fatigue is exceeding our capacity. As Grandjean and Kogi report, the following signs are
observed in conditions of chronic fatigue: (1) a general weakness in drive and loss of initiative; (2) a
tendency to depression associated with unmotivated worries; (3) increased irritability and intolerance
(occasionally exhibited with unsociable behaviors).

In considering the stress correlates of workload and fatigue we need to be aware of the role of the
activating and inhibiting systems of the CNS. We know that the brain contains neural structures
responsible for maintaining wakefulness and alerting the cortex. It has been shown that lesions of the
medial mid-brain make animals inattentive, with low motivation and drowsiness. This structure is located
in the reticular formation of the mid-brain and is called the activating system. Stimulation of this
system arouses the individual or animal, while destruction of it causes the animal to go into a permanent
coma. There are also neural pathways leading impulses from the cerebral cortex back to the activating
system. These corticofugal pathways converging on the reticular formation have a function similar to a
feedback system; that is, impulses originating in the cortex are capable, through this feedback, of
stimulating the ascending reticular activating system which in turn maintains the cortex and the behavior
of the organism in a state of arousal and alertness. All of the classical afferent pathways coming from
the sensory organs send collateral impulses to the reticular activating system. This means that impulses
from the environment, through the sense organs, or from muscle activities, can stimulate the ascending
activating system and thereby increase cortical activity. Though there is recent evidence that lateral
brain stem regions are as important as medial regions for attention and arousal, it is generally admitted
that unspecific neurons decisively regulate arousal and attention. Related to the neurological aspects of
workload/fatigue are other investigations that have shown that stimulation of the activating system can
spread to the autonomic nervous system giving rise to hormonal changes in the internal organs such that
the organism may poise itself for energy expenditure.

The work of Hess (35) showed that electrical stimulation through chronically implanted electrodes
produced a tendency to fall asleep and pronounced muscular relaxation in cats. This discovery,
later confirmed by many others, has shown an active inhibition mechanism which spreads from the subcortical
structures to the cerebral cortex and acts to depress cortical functions. These systems have a direct
depressing influence on the ascending reticular activating system. Therefore, cortical inhibition can
result from two different causes. On the one hand, cortical activity may decrease as a result of lowered
sensory inputs or a lowered corticofugal feedback. This might be called a passive inhibition. On the
other hand, cortical activity can be reduced by an active inhibitory function which would be elicited by
increased activity of the inhibitory system. It is interesting to note that we see changes in the brain
wave, involving a flattening of electrical activity associated with suppressed behavior, in fatigue,
in states of chronic anxiety, and in certain drug effects which act to suppress the central
nervous system. It is important also to remember that the organism regulates its feelings of fatigue or
relative arousal not only through neural mechanisms, but also through endocrine factors which are ultimately
responsible for maintaining a certain functional state for hours or longer periods of time.

In spite of the neurological and endocrinologic relationships discovered and understood, we still
face the problem that fatigue is subjectively evaluated. In other words, just because
the conditions of fatigue exist does not necessarily mean that performance decrement occurs. It may be
true that subjective feelings will precede a loss in performance ability, but not necessarily. It is well
known that in spite of great fatigue the human organism will respond with adequate, if not lifesaving,
performance levels. Further, we know that there is a situation of unacceptable fatigue which people
classify as overwork, overload, exhaustion, or similar terms. Since
these kinds of fatigue concepts are related to subjective judgment, we get back to the
psychophysiological implications of workload, performance, and stress. We will see that some investigators
assessing fatigue feel that there are indexes of fatigue in physiological measures such as increase
in heart rate, reduction of sinus arrhythmia, and so forth. Regardless of whether one is concerned with
the physical aspects of fatigue or the mental aspects of fatigue, certain symptoms can be considered as a
consequence of cortical inhibition activity. The following symptoms of what might be called "cortical
fatigue" are those which need to be evaluated: (1) decrease of attention, (2) slow and impaired perception,
(3) impairment of thinking, (4) decreased motivation, (5) decreased performance speed, (6) decreased
accuracy, and (7) decreased performance reserve for physical and mental activity. While most of these
factors have been investigated to some extent, it is probably the electrical activity of the cortex which
may give a better picture of activity which can be considered as having a direct regulatory effect.

In a factor analytic study of mental fatigue, Kogi and Saito (36) were able to demonstrate that
certain changes in cortical functions were related to various phases of a 24-hour period. The measure of
cortical activity they selected was the critical flicker fusion (CFF) test, but changes in CFF were also
reinforced by changes in a choice reaction time test.

A study by Ettema and Zielhuis (37) investigating the physiological parameters of mental load
demonstrated that a simple binary choice test, providing different mental loads or levels of difficulty,
showed systematic changes in heart rate, sinus arrhythmia, systolic and diastolic blood pressure, and
rate of respiration.

Kashiwagi (38) was able to construct a fatigue rating scale which allows a judgment of human fatigue
through a person's appearance. The use of such a scale might be very helpful in the field as far as
management or field commanders are concerned and might be of use to some of the mission crew fatigue
studies done at the School of Aerospace Medicine by William Storm and his colleagues. Presently, subjective
fatigue and sleep data are collected from various mission groups. These measures are used to assess the
overall effects of mission requirements upon sleep loss and workload requirements (39).

A system using a concept of task-induced stress was developed by this author and used in the stress
testing of special mission personnel in the U.S. Air Force (40). This concept was structured around the
tasks an operator must perform in an advanced space system. He must perform a relatively large array of
discrete, discontinuous operations against a background of monitoring and information processing tasks.

In terms of information theory, the discrete, discontinuous functions would constitute a source of noise
in the form of unwanted or distracting signals when the operator was trying to monitor and process a
continuous input. Increasing the signal rate of the discontinuous tasks makes the detection and
identification of the continuous task more difficult in the same manner that increased noise acts to degrade
audible signal detection and recognition. By structuring the task situation so that the operator is
uncertain as to what is signal and what is noise, it is possible to cause him to continually shift his
attention from signal to noise and noise to signal. This is the nature of competing tasks, and the end
result of such a situation can be regarded as task-induced stress. By further structuring the situation
so that the operator has been allowed to find out that he can in fact perform both the discrete,
discontinuous task and the continuous monitoring task independently, he is quite apt to assume that he should
be able to do them together with perhaps only a little more effort. When he finds out that he has much
difficulty doing both tasks, he is led to conclude that there is something wrong with him and that he would
do much better if he could only find the optimal technique or "a gimmick". This sort of structuring tends
to invite the formation of internalized, psychologic stress which is not relieved much by hostility
towards the tasks themselves. Since there is no obvious source of the proficiency problem presented by
the competing task, the psychologic feelings generated by failure to perform well tend to be self-directed
rather than task-directed. In this situation, the induced stress is more than the sum of the stress of
performing each task independently. The results indicated that a criterion group of those finally
selected for the special mission using various other criteria was better able to adapt to the two competing
tasks and was less susceptible to the signal-noise ambiguity and the induced task stress than the special
mission personnel group as a whole.

It must be noted that the evaluation of highly specialized groups selected by virtue of years of
experience and special talents presents a unique problem to those of us charged with the responsibilities
of investigating the effects of workload, performance, and stress within and upon the human operator.

Another important area of performance decrement in relationship to environmental and operational
working conditions is that of exposure to hot and humid environments. In an important study reported by
Welch, Longley and Lomaev (41), sweat loss and pulse rate were found to be unreliable
methods of measuring fatigue, and skin temperature was completely unreliable as an index of fatigue
except when the temperature and humidity were high. Their experiments confirmed that a rectal temperature
of 38.8° C to 38.9° C will in most cases coincide closely with the onset of actual exhaustion. Another
study in this area by Grivel (42) indicates that the permanent, specific heat effects on psychomotor and
mental performance are related preferentially, in that different aspects of the same activity were
considered to determine the effects of climatic stress. In other words, heat acts differently on the
reactivity aspects of performance than on those aspects of performance associated with continuous attention.

In the field of time-varying heat effects, studies have examined the possible transitory effects of heat
as well as the long-term evolution of effects found at the time of first heat exposure. Here, a sequence
of events can be distinguished, each characterized by a particular kind of ambient heat effect upon
performance. This suggests that some type of learning or conditioning relevant to heat stress takes place.

The Psychophysiological Correlates: We have seen so far that we have both common and scientific knowledge
that individual and combined stresses, both physiological and psychological, can adversely affect mental
performance and judgement as well as physical performance. We have just discussed how both hyperthermia
and fatigue can produce deteriorated objective judgements regarding environmental situations as well as
degraded performance. The individual's subjective judgement or insight concerning the quality of his own
performance is similarly degraded. With sustained exposure to stress, he tends to overestimate his
capability and to discount his errors. Subjective identification of degraded central nervous system (CNS)
function is generally based not on recognition of degraded performance, but on secondary indicators such
as dimming of vision in the case of hypoxia and reduced span of attention in the case of fatigue, to name
two examples. We are primarily interested in looking at psychophysiological parameters which relate to
central nervous system function. Our interest stems from the fact that, to the extent that these CNS
changes are detectable through analysis of peripheral physiologic measures, they can provide a sort of
warning system of primary higher CNS functional decrement in the same sense that an oxygen partial
pressure meter provides primary hypoxia warning.

Many years ago the Cambridge cockpit series of performance/fatigue studies examined behavioral
changes observable through several hours of continuous performance when subjects "flew" a specially
instrumented simulated aircraft. Bartlett (43) summarized these studies in terms of skill-fatigue effects.
The experiments showed beyond question that, under the conditions imposed, "operator fatigue" does occur,
though in most cases the operator himself did not realize it. Within a maximum of 8 hours of simulator
operation, the experimenter concluded that the subjects were still able to perform the operations, but
only if they were especially careful to avoid known deficiencies characteristic of fatigue. Some
inexperienced subjects developed significant deterioration of performance after only lk hours. One highly
motivated, experienced subject went 8 hours without appreciable deterioration. Most subjects, as fatigue
progressed, showed a lack of coordination between the recognition of the required operation and the
necessary response. This is related most logically to impairment of the integrative function of the
association areas of the cerebral cortex. Marked increases in lability and irritability were also observed
along with changes in judgement. As time progressed during the performance, the number of small errors
increased, but these were later replaced by large errors. This was interpreted as reflecting degraded
neuromuscular control with increasing levels of frustration. This was compensated for by a judgemental change
in the standards of accuracy. The subjects were unaware of this change in their judgement of acceptable
performance unless it was called to their attention. This idea of subjective lack of awareness is crucial
in the operation of high performance man-machine systems.

Other studies of fatigue have demonstrated that subjects become tired of a specific task and show
rejuvenated performance upon changing tasks. From a neurophysiological standpoint, this relates to a
reduced level of general CNS activity upon habituation to a monotonous task. However, with the
introduction of a novel stimulus there is a marked increase in CNS activity. This is the so-called arousal or
activation response. Subjective appraisal of performance and its relation to objective criteria under
conditions of fatigue produced by prolonged wakefulness, using skin resistance measurements as well as EEG
tracings, has been reported by Burch and Greiner (44). Generally, they found that the subjective evaluations
showed a high correlation with their bioelectrical measurements during pre-experimental control periods
and the earlier portions of fatigue. However, as the fatigue progressed, the subject's ability to evaluate
his own state of consciousness began to break down.

Several studies of mild hypoxia at the School of Aerospace Medicine have shown frequent lapses in
simple performance tasks lasting only a few seconds and suggestive of a momentary loss of awareness. In
fact, one of the tasks incorporated into a multielement psychomotor test device previously referred to as
Neptune (an acronym for "neuropsychiatric test unit") was designed to provide a relative measure of
operator consciousness during periods of experimentally induced hypoxia. This task was called Auditory
Monitor and involved monitoring three Morse Code signals, "A", "N", and "M", which were played in random
order at a preselected speed. This task indicated momentary losses of awareness, and these periods of
loss correlated with EEG changes indicating reduced cortical arousal.

The stress of sleep deprivation also shows brief, often dramatic intermittent pauses or lapses in
ongoing behavior. Many studies show that these lapses increase in frequency, duration, and depth as
sleep loss increases. Between lapses subjects are able to think and act under challenge almost as well
as under preexperimental conditions. As lapses deepen, it is increasingly difficult for the subject to
hold a stable frame of reference while performing a series of mental operations. If a deep spell of
drowsiness occurs in the middle of a serial operation, the subject will stop the sequence for a brief
time and often lose track of where he had been in the series. Luby (45) has attributed decrement in
psychological test performance under conditions of 118 to 120 hours of sleeplessness to fluctuations of
attention. The frequency of periods of inattention increases as a function of the hours of sleep loss.

In comparing measures of physiologic activity with observed behavior it is necessary to note that
an organized psychomotor response pattern involves three factors which must be integrated at a relatively
high cortical level. These factors are: (1) detection of the signal, (2) selection of the response, and
(3) execution. Under conditions of disorganization the response pattern is fractionated rather than
coordinated, so that certain indications of response pathology occur. For example, under stress conditions
detection may be accompanied by a startle response of varying degree to signals having high attention
value. Signal detection may be degraded by a breakdown of scanning behavior as the attention span is
attenuated. The wrong response may be selected, and execution may be characterized by gross spatial errors
in psychomotor movement, that is, moving first to the general area of the control and then to the control
itself. Other execution errors involve operating the wrong control or using the proper control incorrectly.

In summary then, we see that various physiological and psychologic stressors individually produce
variable degrees of decrement in behavioral performance, some of which are predictable. However, in
combination, the effects of these stressors on performance become difficult, if not impossible, to predict.
Nevertheless, it is possible to monitor neurophysiological states and events to the extent that it is
possible to identify CNS functional changes related to the primary cause of performance decrement. At the
present state of the art, detecting a specific erroneous judgement using CNS functional
criteria is not possible. However, it is possible at present to detect specific mental activity related
to specific external and internal processing events. We will discuss these at a later point. At present
we will deal with the identification of a CNS functional state correlated with unreliable or pathologic
performance and judgement. It is interesting to note that many of the early symptoms
of some organic brain diseases and some focal brain disorders are not unlike some of the behavioral changes
just mentioned, and yet to date no organized study has been made relating psychophysiological variables
to symptomatic or psychometric factors in such disorders.

Psychophysiological Monitoring of CNS Function

The physiologic evaluation of central nervous system function can be approached from two points of
view: (1) the general state of arousal or level of consciousness, and (2) the quantitative and qualitative
aspects of individual or specific CNS responses. We will discuss the first approach from a neurophysiological
standpoint and relate it to some existing data under the title, "General Levels of CNS Activation". The
other approach will be discussed under the heading, "Individual CNS Response."

Before delving into CNS monitoring itself, we should be aware of some of the considerations in terms
of measures and analysis. Individual physiologic measures can be analyzed from at least two aspects: (1)
averaged or integrated values, and (2) quantitative analyses which reveal changes due to individual CNS
responses. These methods have been devised to permit a reduction of data to speed analysis and to allow
for computer handling of the data. Four such measures, the electroencephalogram, the electrocardiogram,
respiration, and electrodermal responses, will be discussed in some detail. In each case we will try to
relate the two kinds of analysis to the two corresponding types or modes of CNS function.

The Electroencephalogram: The potentials observed from scalp electrodes measure a part of the electrical
activity of the underlying superficial cerebral cortex. The specific areas of cerebral cortex are
identified with primary sensorimotor activity and with the integrative function of the association areas
adjacent to these sensory areas. The sensory association areas play a major role in deriving meaning
from the impulses received in the primary sensory areas. The frontal area contributes to the integration
of the sensory association areas, permitting abstraction and conceptualization. One can thus expect observable
changes in EEG patterns relating to changes in activity involving these higher mental functions. Much of
the problem in interpreting EEG patterns is due to the highly complex wave forms developed. Most forms of
analysis have been borrowed from engineering approaches to vibration stress, which also presents multiple
frequency wave forms. In engineering terms this is called frequency spectrum analysis, in which the various
frequencies are partialled or split out for individual analysis for a specific time period. A modification
of this approach developed by Burch (46) shows promise. The Burch method analyzes EEG wave forms in the
time domain expressed as major and minor periods. The major period represents the dominant EEG frequency
for a specified time interval and is defined by baseline crosses of the raw EEG. The minor period
represents the superimposed waves between the baseline crosses. Major and minor periods are each summed and
represented as a total count during a time interval or epoch such as 10 seconds. The Burch method involves
an additional display referred to as spectral analysis. This divides the raw EEG spectrum for each given
epoch into 10 frequency bands with reference to both major and minor periods. This analysis appears in
a form similar to the Grey Walter frequency analysis system. The readout provides a value for each of the
10 major and minor period frequency bands during every epoch. The amplitude of this writeout indicates the
total time in the preceding epoch during which the analyzer detected periods with values falling within
the frequency limit assigned to that particular band. The modal frequency band for either the major or
minor period is that band in which the greatest accumulated time is scored during the epoch. A total of
major and minor period counts represents a characterization or signature of the EEG frequency spectrum
during the epoch selected. Total counts over epochs of a few seconds reflect individual reflexes involving
a major portion of the cortex. Changes in modality of the frequency spectrum during the corresponding
epochs are related to the quantitative aspects of such reflexes.
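
The counting step of the period analysis described above can be illustrated with a minimal sketch. It
assumes the epoch has already been sampled and centred on its baseline. The function names are
hypothetical, and counting slope reversals as "minor periods" is an assumption made here for
illustration, not a description of the Burch analyzer itself.

```python
def sign_changes(values):
    """Count transitions between negative and non-negative values."""
    return sum(
        1
        for a, b in zip(values, values[1:])
        if (a < 0 <= b) or (a >= 0 > b)
    )


def period_counts(samples):
    """Return (major, minor) counts for one epoch of baseline-centred samples."""
    # Major count: baseline crossings of the raw signal.
    major = sign_changes(samples)
    # Minor count (assumption): slope reversals, used here as a stand-in
    # for the superimposed waves riding between baseline crossings.
    slopes = [b - a for a, b in zip(samples, samples[1:])]
    minor = sign_changes(slopes)
    return major, minor


def dominant_frequency_hz(major_count, epoch_s):
    """Two baseline crossings make one full cycle of the dominant wave."""
    return major_count / (2.0 * epoch_s)


# A short made-up epoch: one slow wave with a faster ripple riding on it.
epoch = [1, 2, 1, 2, -1, -2, -1, -2, 1, 2]
print(period_counts(epoch))
print(dominant_frequency_hz(200, 10.0))
```

Under these assumptions, 200 baseline crossings in a 10-second epoch would correspond to a dominant
frequency of 10 cycles per second.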

In contrast to the previously described frequency-period analysis, which is concerned with the time domain
only, Riehl (47) developed a method which relates both time and amplitude domains. He defined an activation
response which he called Ua. This can be written in the form of an equation where Ua equals F (the dominant
frequency) multiplied by the reciprocal of the average amplitude. In this equation the dominant frequency
is defined from the major period count, and the average amplitude is that which is obtained by full wave
rectification and integration. In order to derive this function, an analog computer is employed to integrate
the wave form and to obtain a continuous real-time representation. The integration of Ua over epochs of
10 seconds recorded at a relatively slow chart speed provides a convenient readout of an activation response
over specific time periods. The Ua itself will exhibit major fluctuations of only a few seconds' duration.
These can be evaluated as identifiable CNS responses to known stimuli.
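
As a rough digital illustration of the Ua relationship stated above (dominant frequency multiplied by
the reciprocal of the average amplitude), the sketch below works on sampled data rather than on an
analog computer. The function name `activation_index`, the use of the mean absolute value as the
"full wave rectified and integrated" amplitude, and the resulting units are assumptions made for this
example only.

```python
def activation_index(samples, sample_rate_hz):
    """Approximate Ua = F * (1 / average amplitude) for one epoch."""
    n = len(samples)
    epoch_s = n / sample_rate_hz
    # Dominant frequency F from the major period count:
    # two baseline crossings per full cycle.
    crossings = sum(
        1
        for a, b in zip(samples, samples[1:])
        if (a < 0 <= b) or (a >= 0 > b)
    )
    dominant_hz = crossings / (2.0 * epoch_s)
    # Average amplitude: full-wave rectify (absolute value), then average.
    mean_rectified = sum(abs(x) for x in samples) / n
    if mean_rectified == 0:
        raise ValueError("flat signal: average amplitude is zero")
    return dominant_hz / mean_rectified


# Same wave shape at two amplitudes: the index falls as amplitude rises.
low_amplitude = [1.0, -1.0] * 50
high_amplitude = [2.0, -2.0] * 50
print(activation_index(low_amplitude, 100.0))
print(activation_index(high_amplitude, 100.0))
```

The index rises with faster activity and falls with larger amplitude, which matches the usual picture
of an aroused EEG as fast and low in voltage.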

A more recent approach is power spectrum analysis using a fast Fourier transform. This yields
a representation of extremely small power shifts over very small epochs. This result can be obtained
on-line in terms of percentage of power in each selected frequency band width or in terms of pure power.
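
The percentage-of-power readout can be sketched as follows. To stay self-contained, the example uses a
plain discrete Fourier transform from the Python standard library; a fast Fourier transform gives the
same values more quickly. The function name, the band edges, and the test signal are illustrative
assumptions, not values taken from the studies cited.

```python
import cmath
import math


def band_power_percent(samples, sample_rate_hz, bands):
    """Return the percentage of total spectral power falling in each band."""
    n = len(samples)
    mean = sum(samples) / n
    centred = [x - mean for x in samples]  # remove the DC offset
    spectrum = []
    # Positive-frequency bins below the Nyquist frequency.
    for k in range(1, (n + 1) // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        spectrum.append((k * sample_rate_hz / n, abs(coeff) ** 2))
    total = sum(power for _, power in spectrum)
    if total == 0:
        raise ValueError("signal has no power outside DC")
    return {
        name: 100.0 * sum(p for f, p in spectrum if low <= f < high) / total
        for name, (low, high) in bands.items()
    }


# One-second epoch at 64 samples per second: a 10 Hz wave plus a
# half-amplitude 20 Hz wave (made-up test signal).
rate = 64.0
epoch = [
    math.sin(2 * math.pi * 10 * t / rate)
    + 0.5 * math.sin(2 * math.pi * 20 * t / rate)
    for t in range(64)
]
bands = {"8-13 Hz": (8.0, 13.0), "13-30 Hz": (13.0, 30.0)}
print(band_power_percent(epoch, rate, bands))
```

Because power goes with the square of amplitude, the half-amplitude component carries one quarter of
the power of the larger one, so the two bands split the total roughly 80 to 20.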

The Electrocardiogram: Nervous control of heart rate is classically described as mediated through vagal
parasympathetic cardio-inhibitor fibers and through sympathetic cardioaccelerator fibers. The vagus
nerve cardio-inhibitor fibers originate in the bilaterally paired dorsal motor nuclei of the vagus. These
nuclei lie in the floor of the fourth ventricle throughout most of the length of the medulla oblongata.
[...] substance of the medulla. Beat by beat values of heart rate are obtained by measuring the period (R-R
interval) of each cardiac cycle. An analysis of heart rate trend, or accelerator versus decelerator
information, may be obtained by averaging the frequency of a number of cardiac cycles. This is most
conveniently done by using a cardiotachometer. The beat by beat analysis of cardiac rate represents
a very promising method of observing individual CNS reflex responses. This analysis shows two contrasting
patterns: (1) During sleep, the record consists almost exclusively of a rhythmic increase and decrease
of heart rate coincident with respiration. This is referred to as respiratory coupling. (2) During periods
of wakeful sensory motor activity, such as speaking, walking, etc., the beat by beat pattern of heart rate
shows a preponderance of nonrespiratory coupling or decoupling, showing frequent cardioaccelerator reflexes
as opposed to those seen only occasionally during sleep. The number of premature ventricular contractions
per unit time is observed to increase under conditions of stress. Other specific electrocardiovascular
changes have been reported in the literature, but at this time it is not yet clear whether these are a
function of direct nervous control or indirect humoral influences.
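
The two heart rate readouts mentioned above, beat by beat values from each R-R interval and an averaged
rate over a number of cycles, can be sketched as below. The function names and the R-wave times are
hypothetical; the sketch assumes R-wave times have already been detected and are given in seconds.

```python
def beat_by_beat_rate(r_wave_times_s):
    """Instantaneous heart rate (beats/min) for each R-R interval."""
    intervals = [b - a for a, b in zip(r_wave_times_s, r_wave_times_s[1:])]
    if any(i <= 0 for i in intervals):
        raise ValueError("R-wave times must be strictly increasing")
    return [60.0 / i for i in intervals]


def averaged_rate(r_wave_times_s):
    """Average heart rate (beats/min): cardiac cycles per elapsed minute."""
    cycles = len(r_wave_times_s) - 1
    elapsed_s = r_wave_times_s[-1] - r_wave_times_s[0]
    if cycles < 1 or elapsed_s <= 0:
        raise ValueError("need at least two increasing R-wave times")
    return 60.0 * cycles / elapsed_s


# Made-up R-wave times: one short cycle between two one-second cycles,
# as a brief cardioaccelerator reflex might produce.
times = [0.0, 1.0, 1.5, 2.5]
print(beat_by_beat_rate(times))
print(averaged_rate(times))
```

The averaged value smooths over the brief acceleration that the beat by beat series shows clearly,
which is why the text treats the beat by beat record as the more promising view of individual reflexes.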

Respiration: Control of respiration is mediated through autonomic and voluntary pathways. The primary
respiratory centers lie in the medulla oblongata and in the pons. The medullary centers are described
as paired bilateral half-centers which include both an inspiratory and an expiratory half-center on each
side. The half-centers are contained within the medullary reticular substance. The pontine reticular
formation contains an inhibitory pneumotaxic center and an apneustic center which exerts a strong tonic
effect on the bulbar inspiratory center. Voluntary control of respiration originates in the cerebral
cortex and is mediated through the hypothalamus. Both inhibitory and acceleratory cortical influences
appear most specifically localized in the frontal cortex.

It is interesting to note that the medullary centers for respiratory control and cardioaccelerator
control lie close to each other within the medullary reticular formation. Thus, it is not surprising
that there should be a strong interaction between respiratory activity and heart rate. As we have noted,
during quiet periods of CNS activity heart rate is predominantly coupled to the respiratory cycle, while
during periods of CNS arousal the respiratory coupling is frequently replaced or decoupled by
cardioaccelerator reflexes associated with brief respiratory arrest. Similar transient increases in heart rate
are occasionally associated with a marked increase in respiratory rate. This observation suggests that
the cardioaccelerator reflexes may derive from two clearly distinguishable neurophysiological mechanisms.

In a comprehensive review of sinus arrhythmia reflex mechanisms, Heymans cites clear evidence that
in lower animals the cardiac vagal center is subject to two inhibitory, cardioaccelerator influences: one
arising from the lungs, which exhibits increasing activity with mild pulmonary inflation, and the other mediated
directly from the respiratory center (48). The latter influence is fully capable of producing typical sinus
arrhythmia in the complete absence of pulmonary ventilation. This fact makes it reasonable to postulate
changes in heart rate arising from cortical activity mediated directly through respiratory centers to the
cardiac vagal center.

Electrodermal Responses: Electrodermal responses (EDR), which include the galvanic skin response and the basal
skin response, are predominantly, if not exclusively, mediated by the sympathetic nervous system, which
produces changes in skin resistance highly correlated with sweat gland activity, the so-called galvanic
skin reflex.

A comprehensive review of galvanic skin reflex (GSR) neurophysiology by Wang discusses stimulation,
transection, and ablation techniques employed in the cat to identify CNS excitatory and inhibitory centers.

The  suprasegmental  excitatory  areas  include  the  sensorimotor  area  of  the  cerebral  cortex,  the  hypothalamus 
of  the  diencephalon,  and  the  facilitatory  reticular  system  in  the  diencephalon  and  mesencephalon.  Two 
pathways  of  the  GSR  which  are  separate  at  the  cortical  and  diencephalic  level  converge  on  the  preganglionic 
sympathetic  sudomotor  neurons  in  the  spinal  cord.  The  facilitatory  influence  of  the  diencephalic  and 
mesencephalic reticular system is characterized as follows: When the facilitatory reticular system is
stimulated in both the interbrain and the midbrain, the response or effect varies with the strength of
the stimulating current. A weak current elicits no response itself, but augments the reflex. A moderately
strong current evokes a small response itself and also enhances the reflex. A very strong current which
calls  forth  a  large  response  by  itself,  suppresses  the  reflex  during  and  immediately  after  stimulation, 
but  has  a  late,  long  lasting  facilitatory  effect  on  the  reflex.  This  effect  begins  one  minute  after 
stimulation,  reaches  a  peak  in  two  or  three  minutes  and  then  gradually  declines  to  zero  in  30  to  40  minutes. 

The inhibitory centers identified include the frontal cerebral cortex, the caudate nucleus, the anterior
cerebellar lobe and the bulbar medial reticular formation. The cerebral cortex has the least inhibitory
effect and the bulbar medial reticular formation the strongest.

A  number  of  stimuli  characteristically  elicit  the  GSR  on  selected  sites  of  human  skin.  These  include 
startle,  painful  or  other  strong  sensory  stimuli,  violent  respiratory  activity,  generalized  muscular 
activity, and strong emotional stimuli. GSR activity during arousal conditions can also occur spontaneously in
response to no apparent stimulus. This is the so-called nonspecific GSR. GSRs which occur in response
to known stimuli are then called specific GSRs. The sensitivity to stimuli, as evidenced by the frequency
and magnitude of GSRs, is observed to fluctuate through relatively wide ranges in the course of normal
daily activity. This suggests a threshold-type mechanism that characteristically affects a wide span
of control.

Individual  GSRs  are  characterized  by  a  transient  drop  in  skin  resistance  in  a  period  of  a  few  seconds. 
This  expression  of  reflex  activity  has  been  analysed  in  terms  of  the  number  of  responses  per  unit  time,  of 
the  amplitude  of  the  individual  responses,  and  response  latency  or  the  period  of  time  between  an  administered 
stimulus and the onset of the GSR. Some time ago this author demonstrated that the area subtended by the
recorded GSR, a measure which integrates both time and amplitude, is a more sensitive indicator of stress
than frequency or amplitude alone (49).
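
The measures named above (latency, amplitude, and subtended area) can be sketched for a single sampled
response. This is a minimal illustration, not the procedure of reference (49): the function name, the use of
a pre-stimulus baseline, the onset threshold, and the trapezoidal rule are all assumptions introduced here.

```python
def gsr_measures(times, resistance, baseline, stimulus_time, threshold):
    """Latency, peak amplitude and area of one GSR (a transient resistance drop)."""
    # Keep only samples at or after the stimulus and express the response as a
    # positive drop below the (assumed) pre-stimulus baseline resistance.
    samples = [(t, max(baseline - r, 0.0))
               for t, r in zip(times, resistance) if t >= stimulus_time]
    if not samples:
        raise ValueError("no samples at or after the stimulus")
    # Latency: time from the stimulus to the first sample exceeding the threshold.
    onset = next((t for t, d in samples if d > threshold), None)
    latency = None if onset is None else onset - stimulus_time
    # Amplitude: largest drop below baseline.
    amplitude = max(d for _, d in samples)
    # Area subtended by the response: trapezoidal integral of the drop over time,
    # which combines how deep the response is with how long it lasts.
    area = sum(0.5 * (d0 + d1) * (t1 - t0)
               for (t0, d0), (t1, d1) in zip(samples, samples[1:]))
    return {"latency": latency, "amplitude": amplitude, "area": area}
```

Two responses of equal peak amplitude but different duration give the same amplitude score and different
areas, which is one way to read the claim that the area carries more information than either measure alone.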


Basal skin resistance or BSR varies slowly over a wide range as the individual fluctuates through
states of consciousness on the sleep-arousal continuum. High resistance values are associated with low
levels of consciousness such as sleep, and low resistance values with high levels, as with intense
excitement. Basal skin resistance has been observed to vary over a range of 10 to 1 in a period of 10
minutes during the period of transition from sleep to aroused wakefulness in the morning. The BSR tends
to be lower during periods of frequent GSR and higher during periods of infrequent GSR. The reduction of
BSR during frequent GSR is apparently due to the cumulative drop in resistance resulting from the failure
of complete recovery of the response mechanism to the prereflex level of resistance before the onset of
the next stimulation. This same phenomenon is seen in the repeated stimulation of other neural response
mechanisms. The relationship between BSR and GSR activity is apparently a function of the rate and
magnitude of individual reflexes and the recovery rate of the skin resistance towards higher values.

Thus, BSR can be seen as a form of integrated function of GSR activity. Analysis of both forms of
electrodermal activity provides further quantitative and qualitative information concerning some aspects
of CNS reflex activity.
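
The idea of BSR as an integrated function of GSR activity can be illustrated with a toy model. It is only a
sketch under assumptions not found in the text: each GSR is taken to lower resistance by a fixed fraction,
recovery toward a resting level is taken to be exponential, and the function name and all parameter values
are hypothetical.

```python
import math

def simulate_bsr(gsr_times, duration, rest=100.0, drop_fraction=0.05,
                 tau=30.0, dt=0.1):
    """Toy skin-resistance trace: fractional drop per GSR, exponential recovery."""
    # Count how many GSRs fall on each simulation step.
    events = {}
    for t in gsr_times:
        step = int(round(t / dt))
        events[step] = events.get(step, 0) + 1
    recovery = math.exp(-dt / tau)  # share of the deficit remaining after one step
    resistance = rest
    trace = []
    for step in range(int(round(duration / dt))):
        # Each GSR at this step lowers resistance by a fixed fraction.
        resistance *= (1.0 - drop_fraction) ** events.get(step, 0)
        trace.append(resistance)
        # Between responses the deficit below the resting level decays away.
        resistance = rest - (rest - resistance) * recovery
    return trace

def mean(values):
    return sum(values) / len(values)

frequent = simulate_bsr(range(5, 300, 5), duration=300)      # one GSR every 5 s
infrequent = simulate_bsr(range(60, 300, 60), duration=300)  # one GSR every 60 s
print(round(mean(frequent), 1), round(mean(infrequent), 1))
```

In this model the mean resistance is lower when responses arrive faster than the recovery can complete,
which is the cumulative-drop relationship between rate, magnitude, and recovery described above.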

From this discussion, it is apparent that anatomically the central pathways mediating EDR provide
numerous sources of influence upon the observed reflex. As for its use as a practical indicator of CNS
function, the enormous volume of published GSR literature emphasizes the fact that the GSR pattern produced
is the result of multiple influences at the CNS level. To date these influences are rather poorly identified
and their separate effects on GSR patterns are not clearly distinguished.

The discussion of these four measures referenced to CNS activity does not imply that other measures
may not be of equal or greater value. Additional measures deserving consideration include blood pressure,
pulse  wave  velocity,  EMG  (electromyography),  eye  motion  (REM)  as  in  dream  studies,  and  pupillary  measures. 
Further  research  efforts  will  be  required  to  adequately  determine  the  usefulness  of  the  information  given 
by  each  measure  concerning  the  functional  state  of  the  central  nervous  system. 

Central  Nervous  System  Activation:  With  some  insight  into  measurement  procedures,  we  will  look  at  general 
levels  of  CNS  activation.  The  level  of  CNS  activation  or  arousal  relates  to  the  state  of  consciousness 
normally  ranging  from  deep  sleep  through  wakefulness  to  intense  arousal.  Obviously,  level  of  arousal  is 
influenced  by  many  factors  including  circadian  periodicity,  workload,  emotional  stimuli,  and  internal 
ideation.  From  a  neurophysiologic  standpoint,  the  state  of  consciousness  is  intimately  related  to  the 
activity  of  the  reticular  formation.  It,  in  turn,  may  be  influenced  strongly  by  afferent  motor  activity, 
the  amount  and  kind  of  sensory  stimulation,  and  the  emotional  state  of  the  individual. 

The Reticular Formation: Anatomically, the reticular formation occupies a central location in the brain
stem joining the cerebral cortex with the spinal cord. It is composed of a network of interlaced fibers
and contains nuclei surrounded as a group by the primary sensory and motor pathways connecting the cerebral
cortex with the spinal cord. The central cephalic brain stem, which includes the diencephalic and
mesencephalic reticular formation, is essential for awareness of the environment and voluntary purposeful
movement.

In terms of the relationship of various CNS structures to conscious activity, the ascending reticular
activating system has been identified as having great functional significance. Stimulation of the anterior
portion of the reticular formation elicits electrocortical arousal in animals and has been used to produce
wakefulness in human analeptics. This same anterior portion originates impulses distributed widely over
most of the cortical surface, particularly the association areas. It is interesting to note that there is
a clear distinction between the mere reception of impulses in the primary sensory receptor areas of
the cortex and the meaningful, purposeful activity evoked by concurrent activation of association areas
via the ascending reticular pathways. Thus, we find that impulses corresponding to a visual image arriving
in the primary visual receptor area remain devoid of meaning unless the adjacent visual association areas
are concurrently activated. The arrival of sensory impulses devoid of meaningful association is
characteristically demonstrated in sleep (or dreams).

The complexity of the relationship between the cortex and the reticular formation is emphasized by
the important role played by the descending pathways which strongly influence the core of the brain stem.
It is through these corticifugal pathways that emotional arousal and goal-directed behavior of conscious
processes are mediated.

The Limbic System: The functionally related neural structure called the limbic system surrounds the
attachment of the cerebral hemispheres to the brain stem. This system is positively associated with the
subjective and autonomic motor expression of emotion. Recordings of the electrical activity within the
limbic structures have revealed two patterns of electrical discharge associated with excited behavior.

This kind of behavior apparently involves the reinforcement mechanisms of the limbic system, which serve
both to increase the amplitude and to generalize the distribution of activity in other parts of the brain,
including the reticular formation. Any information from physiologic measurements indicating the level of
activation of the reticular formation should be helpful in determining the behavioral level of
consciousness. We have previously noted the investigative window provided by pupillography into the relative state
of the ascending reticular formation. Additional physiologic patterns or identification of activity which
would serve to indicate the contribution or involvement of the limbic system to arousal would help to
determine its emotional component. At the present stage of development of biomedical monitoring, most
reports of physiologic measures obtained under stress present them in the form of average or integrated
values over relatively long periods of time. The results generally correlate with trends in the level
of CNS arousal. However, there is increasing evidence that differences in the central processing of CNS
responses will, on detailed analysis, be evidenced by relatively small transient changes in the
EEG. These will be related to small changes in heart rate and electrodermal responses which together will
provide more specific and reliable indicators of CNS arousal and the functional state of the central
nervous system.

EEG Indicators: One of the most thoroughly studied features of the normal EEG is the 8 to 13 cycles per
second alpha rhythm, often observed most clearly over the occipital cortex. The complexity and
multivariant nature of this rhythm has been demonstrated, to the extent that it is possible to distinguish
individuals who demonstrate either persistent alpha, responsive alpha or absence of alpha. However, it is
necessary to realize that this is not a fixed classification and that individual alpha patterns actually
encompass a continuum in which "persistent" and "absent" types represent ends of the scale. It has been
found that the sensory modality of the imagery characteristically employed by the individual largely
determines his alpha rhythm. Visual imagery is associated with the absence of alpha, while nonvisual
(auditory and tactile) imagery is associated with persistent alpha; responsive or fluctuating alpha
patterns are related to variance in the individual's imagery modality. This highly variable expression
of alpha rhythm is further complicated by the fact that what appears as a simple rhythm on a primary trace
is really often a complex of frequencies from multiple sources. Finally, it is known that alpha may be
absent due to nonspecific stress effects as seen in chronic neurotic anxiety states.

When present, alpha rhythm appears most predominantly in the relaxed, eyes-closed, awake condition.
However, it is possible to train a human being to produce predominant alpha with his eyes open, while
fully awake, fully conscious, and fully mobile. The disturbance or replacement of the predominant
alpha frequency is called "alpha block" and is seen as a low voltage, higher frequency pattern. This is
characteristic of an attentive state or alerting response. Although alpha rhythm responds sensitively
to a number of features of CNS function, so many factors are involved that no simple unambiguous conclusion
can usually be drawn from the presence or absence of alpha rhythm alone.

Generally, increased cortical arousal is associated with an EEG of lower voltage and higher frequency.
Sleep or certain drugs act to produce a slowing of EEG frequencies, as does training and the controlled
relaxation response of Jacobson. In deep sleep, three waves per second are seen, the so-called delta
waves. In moderate sleep, sleep spindles or bursts of 14 per second waves occur. Dreaming is associated
with rapid eye movements (REM) and takes place in the range of drowsy to light sleep, the so-called
emergent/Stage II type sleep. The term emergent is used to indicate a stage of sleep occurring following
a period of one of the deeper stages of sleep. This phenomenon will take place periodically through the
night, with many individuals exhibiting a particular sleep pattern unique to them alone. No dream activity
is known to take place in delta sleep.

Additional  studies  are  needed  to  establish  a  clear  relationship  between  physiological  measures, 
performance,  and  the  level  of  CNS  arousal  in  the  drowsiness-extreme  alertness  continuum  of  wakefulness. 

In one study, it was observed that there were high levels of major period counts over ten second epochs
during increased levels of arousal and lower major period counts during decreased periods of consciousness.
The converse was generally true of the minor period count. Spectral analysis in this particular study
showed quantitatively the shift of the major period modal band to slower frequencies and a shift of the
minor period modal band to faster frequencies with decreasing arousal as sleep became deeper.
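
A period count of this kind can be sketched as a count of sign changes per epoch. The text does not define
the major and minor period counts; the sketch below assumes, as in classic EEG period analysis, that the
major count is taken from baseline crossings of the primary trace (derivative order 0) and that finer counts
are taken from zero crossings of a derivative of the trace. The function name and parameters are hypothetical.

```python
def zero_crossings_per_epoch(signal, fs, epoch_s=10.0, derivative_order=0):
    """Count sign changes in each consecutive epoch of a sampled trace."""
    x = list(signal)
    # Approximate each derivative by first differences; the scale factor is
    # irrelevant because only the sign of each sample is used.
    for _ in range(derivative_order):
        x = [b - a for a, b in zip(x, x[1:])]
    n = int(round(fs * epoch_s))  # samples per epoch
    counts = []
    for start in range(0, len(x) - n + 1, n):
        epoch = x[start:start + n]
        # A crossing is any adjacent pair of samples lying on opposite sides of zero.
        counts.append(sum(1 for a, b in zip(epoch, epoch[1:])
                          if (a < 0) != (b < 0)))
    return counts
```

Called with derivative_order=0 on a baseline-corrected EEG sampled at fs hertz, the function returns one
count per ten second epoch; a faster dominant rhythm gives a larger count.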

The combination of both frequency and amplitude domains in the activation analysis previously
described shows its sensitivity to some situations while indicating some ambiguity as a simple arousal
indicator. Johnson and Ulett (50) examined 50 college students using a modified EEG
spectrum analyzer. Each subject was examined on three successive occasions under quiet, eyes-closed
conditions. Average values of the frequency spectrum for all students grouped by visits produced three
curves of comparable frequency distribution; however, the curve corresponding to the first visit was
approximately half the amplitude observed on subsequent visits. The authors concluded that the increased
anxiety level of the subjects generated by their apprehension of the initial EEG examination produced this
depression in amplitude at all frequencies. This shows that in this group of subjects, an increase in
anxiety was observable as a decrease in EEG amplitude at all observed frequencies between three and 33
cycles per second. The activation analysis, which is sensitive to such amplitude changes, may be a useful
indicator of the anxiety level of an individual. The precise manner in which anxiety affects CNS function
and the resultant level of performance is not yet clear, but it is obviously a significant contributing factor
in some stressful situations.

In a fatigue study at the USAF School of Aerospace Medicine, four pilots were required to complete
a 24-hour simulator flight with only a two-hour refueling stop in the middle of the run. An activation
analysis of a continuously recorded EEG obtained on one of the flights showed a sustained high level of high
frequency, low amplitude activity during the first several hours. This corresponded to the period of
expressed anxiety on the part of the pilot as to his ability to perform adequately on the simulator.
Interestingly, his first landing rated as one of the poorest of the eleven made during the 24 hours.

Toward the end of the flight a generally lower activation level was observed, with a marked
tendency to fluctuate erratically between moderate and low levels. Thus, it is seen that this method of
EEG analysis promises to contribute useful information regarding the general level and fluctuations of CNS
arousal.

As we have indicated, it is probably fair to state that at the moment there is little
promise of new and exciting use of ongoing EEG material for the enhancement of pilot performance. In
general, we can tell when a subject is getting drowsy, has gone to sleep, or, to a lesser degree of
certainty, is simply inattentive. So we are left with inferring general state changes, which is useful
for monitoring the state of the organism. However, the event-related potential, called ERP (or Cortical
Evoked Response), is another matter. Before discussing the ERP, we note that the work of Donchin et al. has identified
several interesting electronic signatures indicative of cortical activity (51). The first of these is
called N200, and this electronic component is elicited whenever a rare or unexpected event occurs. Another
of these is P300, and this endogenous component is seen in association with task-relevant, rare stimuli.
Another component is the contingent negative variation (CNV). This wave form is a slow negative shift of
potential that occurs during the warned fore-period preceding a motor or mental task. It begins very
shortly after the warning stimulus and terminates after a response decision by the subject or the occurrence
of a stimulus which demands a response. The final, easily identified wave form is a readiness potential,
the RP. This is similar to the CNV in that it is an event-preceding negative shift. It is distinct from
the CNV in the sense that it appears prior to self-paced voluntary responses. Its occurrence is independent
of the presence of an eliciting or command stimulus. These endogenous components of the brain wave have
been studied in connection with arousal, attention, selective attention, emotional valence, assessment of
novelty, time estimation, uncertainty, detection of targets, differential identification of stimuli
independent of size and shape, and the semantic classification of linguistic symbols (52).

Electrocardiogram and Respiratory Indicators: In similar studies of performance the average values of
ECG and respiration have consistently correlated well with general arousal level. The highest values are
seen when performance demands and/or external stressors, such as threats or emergencies, are introduced.

This relationship between average heart rate and the level of anxiety was nicely demonstrated in the highly
significant series of experiments by Walter (53). Over a period of four years, he performed a series of
complete defensive-avoidance conditioning procedures with 58 subjects, 37 normal and 21 psychiatric
patients. He collected sufficient evidence in these experiments to distinguish two types of relations
between average heart rate and blood pressure as indicated by pulse-wave velocity. One relationship
showed a covariation of heart rate and blood pressure in the initial stages of excitement in normal
subjects and in the cumulative effect of experimental stress in disturbed patients. This response was
linked with other signs of generalized tension and anxiety and was associated with adaptive failure or
confusion. The other type of response showed an inverse relationship; that is, when heart rate increased,
blood pressure fell. This was a transient effect showing blood pressure changes of about one-half the
magnitude of the first type of response. This second type of response was frequently elicited by the
penalty tone which was indicative of an erroneous response. This is an excellent example of the increased
resolution and reliability of interpretation afforded by the observation of simultaneous changes in two
related physiological variables.
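
The distinction between the two types of response can be expressed as the sign of the covariance between
simultaneous heart rate and blood pressure samples. This is an illustrative sketch only and not Walter's
method; the function name and the use of sample covariance are assumptions introduced here.

```python
def relation_type(heart_rate, blood_pressure):
    """Classify simultaneous heart-rate and blood-pressure samples by covariance sign."""
    n = len(heart_rate)
    if n != len(blood_pressure) or n < 2:
        raise ValueError("need two equal-length series with at least two samples")
    mean_hr = sum(heart_rate) / n
    mean_bp = sum(blood_pressure) / n
    # Sample covariance: positive when the two variables rise and fall together.
    cov = sum((h - mean_hr) * (b - mean_bp)
              for h, b in zip(heart_rate, blood_pressure)) / (n - 1)
    if cov > 0:
        return "direct"    # first type: heart rate and pressure rise together
    if cov < 0:
        return "inverse"   # second type: pressure falls as heart rate rises
    return "none"
```

A single variable would not separate the two cases, since heart rate increases in both; the sign becomes
available only when the two variables are observed together.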

Ax (54) has reported a steady decrease in the mean value of the ratio of respiratory to nonrespiratory
coupling in five subjects undergoing 123 hours of sleeplessness. This serves to indicate that the ratio of
the length of time that the record is characterized by undisturbed respiration-coupled heart rhythm
compared to cardioaccelerator reflex rhythm relates to CNS function under the stress of sleep deprivation.
This is also an example of change in the peripheral expression of a central nervous reflex activity related
to changes in levels of arousal.

Electrodermal Response Indicators: We have discussed two measures of electrodermal response, the BSR and
the  GSR,  which  are  observed  to  change  with  the  general  level  of  CNS  arousal.  Levy  (55)  found  that  BSR 
compressed  on  a  five  centimeter  per  hour  write-out  was  particularly  valuable  in  monitoring  states  of 
consciousness.  Under  standard  conditions  he  found  the  pattern  of  one  individual’s  skin  response  to  be 
consistently  similar.  However,  the  patterns  of  different  subjects  varied  from  an  almost  straight  line  to 
a  wildly  fluctuating  one.  The  flat  stable  line  which  he  obtained  was  consistently  of  low  resistance  due 
to  frequent  small  amplitude  GSRs.  The  more  variable  tracings  were  of  higher  average  resistance  showing 
less  frequent,  often  large  GSRs  which  tended  to  occur  in  groups.  He  also  observed  that  persons  who 
exhibited the low flat type of basic waking pattern seemed to be able to maintain a more continuous and
higher  level  of  involvement  in  their  environment  than  those  persons  showing  a  more  variable  tracing.  In 
general,  then,  he  reports  a  relatively  stable,  low  value  of  BSR  during  aroused  wakefulness,  a  more  variable 
saw-tooth  pattern  during  drowsiness,  and  a  high  resistance  pattern  during  sleep. 

Similar  changes  in  BSR  have  been  noted  while  monitoring  pilots  during  flight.  Here,  resistance  is 
initially low when the pilot starts flying the aircraft and gradually increases as he relaxes. His
resistance drops if the co-pilot takes control of the plane and is lowest when the co-pilot is active in
a stall-type approach for landing. I suspect we would see the same response in a husband as his wife takes
over  driving  down  the  turnpike. 

In his series of conditioning procedures, Walter observed that an abundance of nonspecific GSRs was
associated with muscular tension, slight tachycardia, raised blood pressure, and some EEG irregularities,
making up the familiar syndrome of tension/anxiety which constitutes one form of CNS arousal. These and
other studies all report similar findings which indicate that the BSR, the number of specific GSRs, and
the amplitude of specific responses, when properly interpreted, can indicate the general level of CNS arousal.

Individual Central Nervous System Responses: Having considered indicators of general CNS activity, we
need to turn to individual CNS responses, since a significant portion of CNS activity concerns reflex
responses to stimuli. Many reflex responses are sufficiently complex to involve a major portion of the
suprasegmental CNS. The qualitative and quantitative identification of ongoing reflex response patterns
should contribute greatly to an understanding of the functional status of the CNS at a particular time.

Those CNS reflexes which have been identified include the adaptive reflex connected with the direction of
a change of stimulus, the defensive reflex in response to a stimulus too strong for normal functioning,
and the orienting reflex. Apart from the reflex responses per se, much evidence concerning central nervous
system function can be garnered from patterns of evoked responses and contingent effects. These latter
responses require a specific applied stimulus of which the subject is aware and which tends to be
distracting or alerting. Evoked responses may provide valuable guidelines for the interpretation of reflex
response patterns observed in stressful situations or response patterns disturbed by specific activity in
the environment.

The Orienting Reflex: The orienting reflex is of particular interest. This reflex, first identified
by Pavlov, has been the focus of an extensive research program in the Soviet Union and has been the
subject of many annual conferences. This reflex is characterized as an unspecific response initiated by
any increase, decrease, or qualitative change of a stimulus independent of its modality. It is really
the "what is it?" reflex of the central nervous system. It only acts to alert and prepare the individual
for action. It does not itself initiate any action and is subject to extinction or habituation quite
easily by repeated presentation of the same stimuli.

Two forms of the orienting reflex have been identified: (1) a generalized orienting and (2) a
localized orienting. For example, the initial presentation of a tactile stimulus produces a generalized
response including an alpha block in the occipital and motor regions of the cortex, a GSR, an increase
in muscle tension via EMG measurement, an eye movement, and a respiratory pause. After a few dozen
repeated presentations, the only response which may be observed would be a transient alpha block in the motor
region of the cortex. Here, the other components of the reaction have been inhibited, transforming the
original reflex picture to a localized or more specific reflex. The total general orienting reflex
picture also includes increase in heart rate, vasoconstriction of finger vessels, and vasodilation of the
head vessels. It is interesting to note that when we change the total sensory input, by adding an
additional stimulus to the now habituated specific response, the generalized response is once again
elicited. This demonstrates the preadaptive, rather than the adaptive, nature of the orienting reflex.

One component of judgement and alertness includes the degree to which the individual is asking
questions of, and interacting with, his environment. While the frequency and magnitude of orienting
reflexes may provide valuable indications of this interaction, it is fair to state that it is presently
difficult to differentiate these effects from the general level of CNS arousal. Theoretical considerations
and some preliminary reports suggest that it may be possible to demonstrate greater specificity of
individual CNS reflexes. At least this is the desired direction for further research which is aimed at
permitting clear distinction of CNS arousal to fear, anger, curiosity, and so forth.

EEG Indicators: As we have noted, the identification of the cortical components of central nervous system
reflexes has depended primarily upon observation of alpha block indicating cortical arousal. We have
also noted the presence of distinctive frequency shifts with cortical arousal even when the initial
cortical rhythm is other than alpha. However, unaided visual interpretation, or other gross measures of
the EEG, do not permit easy identification of these changes, and thus increasing attention has been
focused upon the various methods of examining the EEG in a more microscopic fashion.

The use of toposcopical analysis of EEG records shows interesting contingent effects of so-called
"social" versus "defensive" conditioning of alpha rhythms. Walter was able to demonstrate divergent
changes in alpha when a subject was performing a task in cooperation with the experimenter's instructions,
that is social conditioning, as compared to being thrown on his own resources to solve problems posed by
the experimenter, that is defensive conditioning.

We have explored the cardiac, respiratory and electrodermal indicators of individual CNS arousal and
other activities, and their relationships to each other, and can now turn to a more specific type of
electrocortical activity which promises to give us a great deal of information in assessing human mental
processing activity.

While it is well known and accepted that the task of pilotage and airborne systems controllers has
changed dramatically from "seat-of-the-pants" type flying to sophisticated monitoring, pattern recognition
and decision making, we are yet unable to identify, much less quantify, such mental processes.
Nevertheless, as we have indicated, recent research shows that certain mental acts are related to specific
electronically identifiable wave forms as well as to changes in related physiological parameters. Since
such factors as fatigue, workload and stress (physiologic as well as psychologic) affect mental performance,
it is highly desirable to be able to identify and quantify such measures. The main thrust of this research
is the identification of specific cortical responses or response patterns evoked by specific stimuli.

These event-related potentials (ERPs) can be characterized as an EEG response wave form, having both positive
and negative values, with certain amplitudes and specific latencies and duration times. In studying these
potentials, a series of positive and negative deflections is averaged for a group of trials. This
characteristic wave-form signature, elicited by a specific stimulus, can be conceptually and empirically
divided into two categories. The earlier components, those occurring in the first 100 milliseconds or so
subsequent to the stimulus, are referred to as exogenous. These exogenous components reflect
characteristics intrinsic to the stimulus event itself, such as loudness, brightness, intensity or other
psychophysical attributes. This activity is considered to represent the processing of sensory information.
The later components, up to perhaps 600 milliseconds beyond the stimulus, are considered to be endogenous.
These endogenous components reflect cognitive processes and attributes of the stimulus deriving not from
its physical properties, but from its task-relevant context. As Lawrence (56) in an unpublished paper
states, "it is these latter components, reflecting aspects of performance potentially applicable to cockpit
or crew station situations which are of primary interest."
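
The averaging procedure and the division into early and later components can be illustrated with a
short sketch. It is not taken from any of the studies cited; the sampling interval, the number of
trials, the noise level and the idealised deflections at 50 and 300 milliseconds are all assumptions
chosen only to make the principle visible, and the function names are hypothetical.

```python
import random

SAMPLE_MS = 10   # assumed sampling interval in milliseconds
N_SAMPLES = 60   # assumed epoch length: 0-590 ms after the stimulus


def average_erp(epochs):
    """Average stimulus-locked epochs point by point across trials."""
    n_trials = len(epochs)
    return [sum(trial[i] for trial in epochs) / n_trials
            for i in range(len(epochs[0]))]


def split_components(erp, boundary_ms=100, end_ms=600):
    """Split the averaged wave form into an early (exogenous) window
    and a later (endogenous) window, using the approximate limits in the text."""
    b = boundary_ms // SAMPLE_MS
    e = end_ms // SAMPLE_MS
    return erp[:b], erp[b:e]


def largest_deflection(segment, offset_ms=0):
    """Return (latency in ms, amplitude) of the largest absolute deflection."""
    idx = max(range(len(segment)), key=lambda i: abs(segment[i]))
    return offset_ms + idx * SAMPLE_MS, segment[idx]


# Idealised "true" response (an assumption, not measured data):
# a negative deflection at 50 ms and a positive deflection at 300 ms.
template = [0.0] * N_SAMPLES
template[5] = -2.0
template[30] = 5.0

# Each simulated trial is the template buried in background activity.
rng = random.Random(0)
epochs = [[v + rng.gauss(0.0, 1.0) for v in template] for _ in range(200)]

erp = average_erp(epochs)
early, late = split_components(erp)
early_peak = largest_deflection(early)
late_peak = largest_deflection(late, offset_ms=100)
print(early_peak, late_peak)
```

Because the background activity is not time-locked to the stimulus, it tends to cancel in the average
while the evoked deflections remain, which is why a group of trials rather than a single trial is
examined.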

Lawrence speculates on a computer controlled workload allocation in a pilotage situation wherein
"the use of brainwaves for the automatic enhancement of warning effectiveness could occur in two ways. A
computer could sense some deficit in an operator's state of being, or potential deficit (anticipating a
crisis) by making inferences about operator state from a set of physiological information channels. The
computer could also observe a lack of attention to a warning display, or other performance deficit, and
take action to somehow stimulate the human operator. Here, it would make inferences about observed
deficits in operator performance, probably from assessment of ERPs, or their lack, in response to warning
signal displays used functionally as probes. There exists the potential for use of brain wave indicators
of dangerous operator state or behavior in that the presence of theta can predict drowsiness and
deterioration of performance and of course the sleeping state can be readily detected. The detection of
undesirable levels of arousal, that is, inappropriately high or low, or undesirable emotional states can
probably be enhanced through the use of EEG information in conjunction with that obtained through other
physiological or behavioral channels. Attention to a display could be assessed by a steady state ERP
technique. ERPs could also be used to distinguish nonresponse to a warning signal resulting from a
purposeful decision to ignore it, from accidental nonrecognition. This way, the hypothesized
computer-based system could refrain from repetition or intensification of information which the operator has
already processed and to which he presumably will eventually respond. Using brain wave information, a
computer could also determine the occurrence of an event like target acquisition and utilize this
information better by employing a built-in control loop than a human observer could by employing a gross skeletal
response such as pushing a button." It must be recognized that, with the advances of high speed,
high-performance aircraft, there are circumstances which exist or will soon exist where a few milliseconds could
provide an important advantage. This is especially true if one considers the small single savings in
time and effort that would be accumulated over a rapidly successive series of events in a continuing
recycling context of swift decision making and swift response.
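
The warning logic that Lawrence hypothesizes can be set out as a simple decision rule. The sketch
below is only an illustration of the reasoning in the passage quoted above; the four yes/no inputs
and the action names are hypothetical, and the far harder problem of actually detecting theta
activity or an ERP in the recorded signal is not addressed.

```python
from enum import Enum


class Action(Enum):
    NONE = "no action"
    REPEAT_WARNING = "repeat or intensify the warning"
    STIMULATE = "stimulate the operator"


def choose_action(warning_shown, erp_detected, responded, theta_dominant):
    """Pick a system action from assumed yes/no indicators of operator state."""
    # Theta activity is taken, as in the quoted passage, to predict drowsiness.
    if theta_dominant:
        return Action.STIMULATE
    if warning_shown and not responded:
        # An ERP to the warning suggests the operator has processed it and
        # is purposefully withholding a response, so the warning is not repeated.
        if erp_detected:
            return Action.NONE
        # No ERP suggests accidental nonrecognition of the warning.
        return Action.REPEAT_WARNING
    return Action.NONE


# Warning seen and processed but not yet answered: the system holds off.
print(choose_action(True, True, False, False).value)
# Warning apparently not registered at all: the system repeats it.
print(choose_action(True, False, False, False).value)
```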

As Lawrence points out, a more proximal goal would be the development of machine ability to sense
such general intangibles as operator uncertainty and the need for more information or a need to maintain
certain decision options and an upgrading of information relative to a particular pilot's role in an
overall mission. Here, instantaneous, qualitative feedback to the machine could be given in the same way
that varying intensities of temperature guide a missile toward a heat source. The ability to sense these
variables continuously and sensitively would provide the basis for the very fine control of machine by
man, perhaps even along the line of the creation of an artificially intelligent servomechanism so closely
responsive in real time to the operator's cognitions and perceptions that it could serve virtually as a
functional extension of his own nervous system. It would seem that as we computer assist the functional
machine we must also arrange to computer assist the functioning human being as the operator of that
man-machine system. With this development, the problems of workload, performance, and stress would
undoubtedly be resolved and laid to rest for once and for all.
SUMMARY


The first three chapters provide a conceptual framework for workload, fatigue, and stress, within
which to evaluate the remainder of this report. In each case, the authors attempted to be brief, to
present a "capsule" statement of different definitions and orientations, and to the extent possible to
prevent their own biases from entering into the text. What is the probability that all readers will be
fully satisfied with the contents of the three chapters? Probably minimal, but hopefully few readers
will be grossly dissatisfied.

The next three chapters, taken as a single unit, give a picture of the workload arena in a broad
sense, partly historical and partly in terms of specific sub-problems and suggestions regarding selected
methods or measurements. These chapters, therefore, augment the conceptual framework provided by the
first three chapters.

Chapters 7, 8, and 9 come to grips in a concrete way with the critical issue of the anatomy of workload
measurement technology. Chapter 7 provides a schema (a generalised representation or framework of a
topic or problem area derived through an analytic but pragmatic process) for workload research. Chapter
8 describes a moderately less encompassing but still global program design applied to workload problems
by one laboratory. Chapter 9 presents one modelling approach to workload; there are others, of course,
as the author points out. As an aside, Chapter 9 is also a "preview" of an AGARDograph which the
Aerospace Medical Panel is considering sponsoring in the near future. These three chapters are recommended
particularly to laboratory directors, program directors, and supervisory scientists as tools for
evaluation and goal-setting in their own programs in workload research.

Chapters 10 through 18 deal with selected measures applied to specific problems in specified settings.
The  first  six  are  concerned  with  aircrew  studies  and  the  last  three  with  air  traffic  control  studies. 

There are many such sets of studies which could have appeared in this part of this report. These appear
because they were offered and because we, the editors, valued both the investigator and the work he
reported. In each case, the reader will be able to see how one investigator approached one specific
problem using his own skills and the resources available to him. The virtue of this is that it lets the
reader move from "frameworks," "schemas," etc., to concrete examples.

Chapter 19 stands by itself in this document. It is a modest compendium in which some measures from
some domains (e.g., psychophysiology) are described and critiqued, the critiques clearly influenced by
the skills, experiences, and biases of the author. The term "modest" is used to make a point. A
compendium like this could be a probably useful and certainly very long handbook, probably two or three weighty
volumes. The working group initially tasked itself with this objective, proposing to use a draft
handbook offered by a US colleague, but it became apparent very early that the task was beyond the working
group's capabilities (time, level of effort, etc.). This might be a useful future task for Aerospace
Medical Panel sponsorship, though probably not in the conventional working group mode of operation.

Two points should be made in concluding. First, all papers after [illegible] contain pieces of
studies, some data, analyses, findings, and so forth. The editors believe this enriches the more global
parts of each chapter. Second, there are references given at the end of each chapter. Taken together,
as a package, these constitute a highly useful bibliography.

Workload  measurement  methodology  is  in  a  continuing  process  of  unfolding,  acquiring  new  technology 
and  instrumentation,  moving  into  new  measurement  domains.  This  topic  should  be  revisited  by  AGARD  after 
a  reasonable  period  of  rest  and  recovery. 